Research Engineer, Models Training
Location: Seattle, King County, Washington, 98127, USA
Company: Anthropic
Employment type: Full Time, Apprenticeship/Internship
Listed on: 2026-01-01
Job specializations:
- IT/Tech: Machine Learning / ML Engineer, Data Engineer
Job Description
About Anthropic
Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.
About the Role
Reward models are a critical component of how we align our AI systems with human values and preferences, serving as the bridge between human feedback and model behavior. In this role, you'll build the infrastructure that enables us to train reward models efficiently and reliably, scale to increasingly large model sizes, and incorporate diverse forms of human feedback across multiple domains and modalities.
You will own the end-to-end engineering of reward model training at Anthropic.
You’ll work at the intersection of machine learning systems and alignment research, partnering closely with researchers to translate novel techniques into production-grade training pipelines. This is a high-impact role where your work directly contributes to making Claude more helpful, harmless, and honest.
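As background (this is not part of the posting, and nothing here describes Anthropic's actual implementation): reward models are commonly trained on pairs of responses where humans marked one as preferred, using a Bradley-Terry-style objective that scores the chosen response above the rejected one. A minimal sketch of that pairwise loss in plain Python:

```python
import math

def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log(sigmoid(r_chosen - r_rejected)).

    The loss shrinks as the model scores the human-preferred response
    above the rejected one, and grows when the ranking is inverted.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(x)) computed stably as log(1 + e^{-x})
    return math.log1p(math.exp(-margin))

# A confident correct ranking yields a small loss...
low = pairwise_preference_loss(2.0, -1.0)
# ...while an inverted ranking is penalized heavily.
high = pairwise_preference_loss(-1.0, 2.0)
assert low < high
```

In a real pipeline the two scalars would come from a learned model scoring full responses, and the loss would be averaged over batches of preference pairs; the scalar version above just illustrates the shape of the objective.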
Note:
For this role, we conduct all interviews in Python.
Responsibilities
• Own the end-to-end engineering of reward model training, from data ingestion through model evaluation and deployment
• Design and implement efficient, reliable training pipelines that can scale to increasingly large model sizes
• Build robust data pipelines for collecting, processing, and incorporating human feedback into reward model training
• Optimize training infrastructure for throughput, efficiency, and fault tolerance across distributed systems
• Extend reward model capabilities to support new domains and additional data modalities
• Collaborate with researchers to implement and iterate on novel reward modeling techniques
• Develop tooling and monitoring systems to ensure training quality and identify issues early
• Contribute to the design and improvement of our overall model training infrastructure
You may be a good fit if you:
• Have significant experience building and maintaining large-scale ML systems
• Are proficient in Python and have experience with ML frameworks such as PyTorch
• Have experience with distributed training systems and optimizing ML workloads for efficiency
• Are comfortable working with large datasets and building data pipelines at scale
• Can balance research exploration with engineering rigor and operational reliability
• Enjoy collaborating closely with researchers and translating research ideas into reliable engineering systems
• Are results-oriented with a bias towards flexibility and impact
• Can navigate ambiguity and make progress in fast-moving research environments
• Adapt quickly to changing priorities, while juggling multiple urgent issues
• Maintain clarity when debugging complex, time-sensitive issues
• Pick up slack, even if it goes outside your job description
• Care about the societal impacts of your work and are motivated by Anthropic's mission
Strong candidates may also have experience with:
• Training or fine-tuning large language models
• Reinforcement learning from human feedback (RLHF) or related techniques
• GPUs, Kubernetes, and cloud infrastructure (AWS, GCP)
• Building systems for human-in-the-loop machine learning
• Working with multimodal data (text, images, audio, etc.)
• Large-scale ETL and data processing frameworks (Spark, Airflow)
Representative projects
• Scaling reward model training to handle models with significantly more parameters while maintaining training stability
• Building a unified data pipeline that ingests human feedback from multiple sources and formats for reward model training
• Implementing fault-tolerant training infrastructure that gracefully handles hardware failures during long training runs
• Developing evaluation frameworks to measure reward model quality across diverse domains
• Optimizing training throughput to reduce iteration time on reward modeling experiments
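The fault-tolerance project above can be illustrated with a common pattern: write checkpoints atomically and resume from the latest one after a crash or restart. This is a generic sketch with hypothetical names, not a description of Anthropic's infrastructure:

```python
import json
import os

def save_checkpoint(path: str, step: int, state: dict) -> None:
    # Write to a temp file, then rename: a crash mid-write
    # can never leave a half-written checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, path)  # atomic on POSIX and Windows

def resume_or_start(path: str) -> tuple[int, dict]:
    # Resume from the checkpoint if one exists; otherwise start fresh.
    if os.path.exists(path):
        with open(path) as f:
            ckpt = json.load(f)
        return ckpt["step"], ckpt["state"]
    return 0, {"loss_sum": 0.0}

def train(path: str, total_steps: int = 10) -> tuple[int, dict]:
    # Hypothetical training loop that survives restarts: if the process
    # dies, rerunning it picks up at the last checkpointed step.
    step, state = resume_or_start(path)
    while step < total_steps:
        state["loss_sum"] += 1.0 / (step + 1)  # stand-in for a real update
        step += 1
        save_checkpoint(path, step, state)
    return step, state
```

Running `train` to step 5, then calling it again with a higher step count, resumes from the checkpoint rather than recomputing from scratch, which is the behavior a long multi-day training run needs when hardware fails mid-run.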
The expected base compensation for this position is below. Our total compensation package for full-time employees includes equity,…