Data Scientist; Remote
Sunnyvale, Santa Clara County, California, 94087, USA
Listed on 2026-06-19
IT/Tech
AI Engineer (Applied/Software), Machine Learning/ ML Engineer
CrowdStrike, Inc.
Full time
R29082
As a global leader in cybersecurity, CrowdStrike protects the people, processes and technologies that drive modern organizations. Since 2011, our mission hasn't changed: we're here to stop breaches, and we've redefined modern security with the world's most advanced AI-native platform. Our customers span all industries, and they count on CrowdStrike to keep their businesses running, their communities safe and their lives moving forward.
We're also a mission-driven company. We cultivate a culture that gives every CrowdStriker both the flexibility and autonomy to own their careers. We're always looking to add talented CrowdStrikers to the team who have limitless passion, a relentless focus on innovation and a fanatical commitment to our customers, our community and each other. Ready to join a mission that matters?
The future of cybersecurity starts with you.
The Data Science team is expanding and is looking for a Data Scientist to help build the next generation of agentic systems for cybersecurity. CrowdStrike's cybersecurity data is one-of-a-kind: we process nearly a trillion behavioral events per day. You'll work where Machine Learning, Big Data, and Cybersecurity converge (training models, building AI agents, and rigorously measuring whether they work) on data and problems you won't find anywhere else.
What You'll Do:
- Work at the intersection of Artificial Intelligence and Threat Research
- Work closely with subject-matter experts in cybersecurity to understand analyst workflows and their security operations procedures
- Post-train LLMs and agents — supervised fine‑tuning and reinforcement learning (RLHF/RLAIF, PPO/GRPO/DPO, reward modeling) — to automate analyst procedures and improve reliability on real security tasks
- Devise AI agents and combine them into increasingly complex workflows: planning and reasoning loops, tool and function calling, and retrieval and memory
- Research new approaches to agentic planning, and prototype state‑of‑the‑art methods from the literature
- Establish objective criteria for benchmarking agentic systems — evals, LLM‑as‑judge pipelines, and trajectory‑level metrics, with real statistical rigor
- Optimize prompts and inference to get the most out of every model
- Collaborate and coordinate across Engineering, Data Science, and Managed Services teams, and partner with engineers to take prototypes toward production
- Keep track of developments in the field of Artificial Intelligence and help identify, define, and prioritize areas for research
What You'll Need:
- Excellent foundations in machine learning, probability, and statistics, with sound instincts for uncertainty, bias/variance, and experimental design
- PhD-level depth of understanding in modern machine learning research — a doctorate itself is not required, but we expect equivalent mastery, including the ability to read, critique, implement, and improve upon current papers
- Experience training generative models, with a strong command of LLM training fundamentals (architecture, optimization, tokenization, data, and scaling behavior)
- Reinforcement learning / post‑training as a core skill: RLHF/RLAIF, policy optimization (PPO/GRPO/DPO), reward modeling, and building RL environments for agents
- Experience building agentic systems: agent architectures (ReAct, planning, reflection), tool and function calling, and retrieval/memory/context management
- Experience with systematic prompt optimization, and with designing and building evals for LLM systems
- Fluency with GPUs, PyTorch, and the common LLM training and serving stack (e.g., Hugging Face Transformers/TRL/PEFT, DeepSpeed/FSDP, vLLM/TGI/SGLang)
- Strong, reproducible research engineering: clean Python and disciplined experiment tracking that your collaborators can build on
- Ability to work independently on ambiguous and complex objectives, and to communicate clearly within a large project team
- Experience generating training data and environments — synthetic data, agent trajectories/rollouts, and task simulators
- Familiarity with inference‑time scaling / test‑time compute (search, self‑consistency, verifier‑guided…