
Applied Reinforcement Learning Engineer

Job in Redmond, King County, Washington, 98052, USA
Listing for: Centific
Full Time position
Listed on 2026-05-25
Job specializations:
  • IT/Tech
    Data Scientist, AI Engineer (Applied/Software), Machine Learning/ML Engineer
Salary/Wage Range or Industry Benchmark: 150,000 - 300,000 USD yearly

Location:
Palo Alto, CA or Seattle, WA (Hybrid/Remote)

Salary: $150K – $300K Annually

About Centific

Centific is a frontier AI data foundry that curates diverse, high-quality data, using our purpose-built technology platforms to empower the Magnificent Seven and our enterprise clients with safe, scalable AI deployment. Our team includes more than 150 PhDs and data scientists, along with 4,000+ AI practitioners and engineers, and an integrated ecosystem of 1.8 million vertical domain experts across 230+ markets.

Our zero-distance innovation™ solutions for GenAI can reduce GenAI costs by up to 80% and bring solutions to market 50% faster.

About the Team

Centific AI Research advances foundational AI models and applications through reinforcement learning, alignment, and human-centered intelligence. We're building governed simulation environments that let enterprises safely iterate and improve AI agent workflows — bridging human-labeled signal creation with automated post-training for high-stakes operations.

The Role

You’ll build simulation environments that mirror real enterprise workflows and post-train LLM agents inside them. Your environments, reward functions, and verifiers become the training ground for production agents handling document processing, compliance, customer operations, and multi-step reasoning across regulated industries.

This role sits at the intersection of LLM post-training research and production engineering. You’ll translate customer workflows into bespoke environments, design reward signals that hold up under optimization pressure, and ship pipelines that turn human-labeled traces into measurable agent improvements.

What You’ll Do
  • Design simulation environments and digital twins for enterprise workflows
  • Post-train LLM agents using the right method for the task — RLHF, DPO, GRPO, PPO, and whatever comes next
  • Build pipelines that turn human-labeled traces and verifiable signals into training data
  • Architect multi-turn, tool-using agents with closed learning loops
  • Design reward functions and verifiers that resist reward hacking and reflect real task outcomes
  • Translate research into production; contribute to publications

Required Qualifications
  • 3+ years fine-tuning LLMs, with hands-on experience in RL post-training
  • Experience building or training LLM-based agents — tool use, multi-turn reasoning, trajectory evaluation
  • Strong Python and software engineering skills; comfortable building pipelines, not just notebooks
  • Working knowledge of modern post-training and rollout-serving libraries
  • MS/PhD in CS, ML, or related field, or equivalent industry experience

Preferred Qualifications
  • Publications at NeurIPS, ICML, ICLR, ACL, COLM, or similar venues
  • Open-source contributions to post-training or agent frameworks (TRL, veRL, OpenRLHF, SkyRL, or similar)
  • Background in classical RL
  • Domain experience in healthcare, finance, logistics, or compliance
  • Experience with synthetic data generation, simulation, or world models
  • Distributed training experience

Why Join Centific
  • Lead the frontier. Shape a new discipline at the intersection of post-training, simulation, and enterprise AI
  • Ship your science. See your research power real systems across healthcare, finance, and safety-critical operations
  • Collaborate with leaders. Work alongside NVIDIA, Microsoft, and the global AI community
  • Build what matters. Create governed, compliant AI systems enterprises can actually trust

Learn more about us at

Centific is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, ancestry, citizenship status, age, mental or physical disability, medical condition, sex (including pregnancy), gender identity or expression, sexual orientation, marital status, familial status, veteran status, or any other characteristic protected by applicable law. We consider qualified applicants regardless of criminal histories, consistent with legal requirements.
