Machine Learning Research Engineer, Agents - Enterprise GenAI

Job in San Francisco, San Francisco County, California, 94199, USA

Listing for: Gravity Engineering Services Pvt Ltd.

Full Time position
Listed on 2026-06-06

Job specializations:

IT/Tech
AI Engineer (Applied/Software), Data Scientist, Machine Learning/ML Engineer, Artificial Intelligence

Salary/Wage Range or Industry Benchmark: 125,000 - 150,000 USD yearly

About the Role

AI is becoming vitally important in every function of our society. At Scale, our mission is to accelerate the development of AI applications. For 9 years, Scale has been the leading AI data foundry, helping fuel the most exciting advancements in AI, including generative AI, defense applications, and autonomous vehicles. With our recent investment from Meta, we are doubling down on building state-of-the-art post‑training algorithms to reach the performance necessary for complex agents in enterprises around the world.

The Enterprise ML Research Lab works on the front lines of this AI revolution. We are building an arsenal of proprietary research, tools, and resources that serve all of our enterprise clients. As an Agent MLRE (Machine Learning Research Engineer), you will apply our agent RL training and agent‑building algorithms to real-life enterprise datasets from our clients and to benchmarks. This will involve creating best‑in‑class agents that achieve state-of-the-art results through a combination of post‑training and agent‑building algorithms.

If you are excited about shaping the future of the modern GenAI movement, we would love to hear from you!

Responsibilities

Train state-of-the-art models, both those developed internally and those from the community, for deployment to our enterprise customers.
Research cutting-edge algorithms to integrate directly into our training stack.
Build agents that leverage our proprietary agent‑building algorithms to automatically hill-climb on datasets, including defining highly performant tools, multi‑agent systems, and complex rewards.

Requirements

1-3 years of experience building with LLMs in a production environment
Experience with post‑training methods such as RLHF and RLVR, and related algorithms such as PPO and GRPO
Publications in top conferences such as NeurIPS, ICLR, or ICML within the last two years
PhD or Master's in Computer Science or a related field
