Research Engineer, Frontier Capabilities
Cambridge area, Massachusetts, USA
Software Development

About Lila

Lila Sciences is the world’s first scientific superintelligence platform and autonomous lab for life, chemistry, and materials science. We are pioneering a new age of boundless discovery by building the capabilities to apply AI to every aspect of the scientific method. We are introducing scientific superintelligence to solve humankind’s greatest challenges, enabling scientists to bring forth solutions in human health, climate, and sustainability at a pace and scale never experienced before.

Learn more about this mission a.ai.

If this sounds like an environment you’d love to work in, even if you only have some of the experience listed below, we encourage you to apply.

Your Impact at Lila

The AI Research team is tackling one of the most exciting open problems in AI: training LLMs to run long-horizon scientific discovery tasks. Our approach spans the full post-training stack, from SFT to asynchronous RL on agentic harnesses, teaching models to plan, use tools, and learn from experience in domains where the ground truth isn’t a preference label but a scientific result.

We’re rapidly growing our Research Engineering org and seeking talented engineers and ML practitioners across levels to design, build, and optimize systems to push this frontier: scaling post-training, sharpening reasoning, and unlocking compute-intensive agentic-harness training. This is a rare chance to join an early team with the autonomy, flexibility, and compute to tackle frontier science problems.

Work Streams

Stream A: GPU Optimization & Training Performance

Maximize hardware utilization across 100B+ parameter asynchronous RL training runs. Responsibilities include profiling, performance optimization, custom kernel development, communication-computation overlap, and long-context throughput improvements. You set and maintain the performance baseline.

Stream B: Stack & Infrastructure

Own the post-training infrastructure end‑to‑end: supervised fine‑tuning, asynchronous RL with tool integration, and data pipelines. Build modular, reproducible workflows with single‑command execution. Manage upstream framework upgrades and deliver composable pipelines spanning Data, SFT, and RL stages. You work closely with Research Scientists to develop and productionize novel algorithms that run at scale.

Stream C: Model Experimentation

Bring deep, hands‑on experience training large language models. Lead experimentation on reasoning model development, including mixture‑of‑experts stabilization, curriculum design, and synthetic reasoning trace generation. You have a bias toward experimental design and tracking, and know how to prioritize runs that yield promising outcomes.

Stream D: Evaluations & Benchmarks

Design and build best‑in‑class scientific agentic benchmarks and harnesses, along with the dashboards and leaderboards that inform every training decision. You have experience working with well‑known public benchmarks and have spent time building bespoke agentic benchmarks and harnesses.

Stream E: Agentic Capabilities & Frontier Research

Train models capable of planning, exploration, and tool use over extended horizons. Advance the state of the art in RL at scale with tool‑calling, subgoal decomposition, and shared memory/skills across trials to expand the frontier of scientific agent capabilities.

What You’ll Need to Succeed

Strong software engineering skills in Python; C++/CUDA a plus
Experience with distributed ML training frameworks (Megatron‑LM, TorchTitan, DeepSpeed, Ray)
Understanding of large‑scale model training techniques for 100B+ models
Experience with cloud or HPC environments
Ability to communicate technical results to internal and external stakeholders

Bonus Points For

Prior work with large‑scale scientific datasets or domain‑specific modeling
Contributions to open‑source ML frameworks
Experience with RL post‑training (RLHF, GRPO, tool‑augmented RL)
Experience training MoE architectures

Location

San Francisco, CA or Cambridge, MA (Remote, Hybrid, and On‑Site available depending on team needs).

Compensation

We offer competitive base compensation with bonus potential and generous early‑stage equity. Your final offer will reflect your background,…
