
Research Intern, Agent RL Training

Job in Mountain View, Santa Clara County, California, 94039, USA
Listing for: GoTo Meeting
Full Time, Apprenticeship/Internship position
Listed on 2026-06-02
Job specializations:
  • Software Development
    Data Scientist, AI Engineer, Machine Learning/ML Engineer
Salary/Wage Range or Industry Benchmark: 10000 - 60000 USD Yearly
Job Description & How to Apply Below

About News Break

Founded in 2015, News Break is the Content Intelligence platform shaping the future content economy. With over 40 million monthly active users, our flagship platform delivers highly personalized local news and information powered by advanced AI, recommendation systems, and adtech.

Recognized by Fast Company as #32 on the Top Workplaces for Innovators, we’re proud to be Great Place to Work certified and home to a dynamic team of technologists, product innovators, and business leaders who are passionate about solving meaningful challenges at scale.

Together, we reached unicorn status in 2021, and we remain committed to continuing this high-growth trajectory with the right team to fulfill our mission: building the infrastructure layer for content intelligence.

About the Role

We are looking for a Research Intern to join our Agent RL Training team. You will be paired with a full-time employee as your mentor, working together to explore, from zero to one, how to apply large language models to News Break’s core business, including content understanding, recommendation, agentic web browsing, and autonomous multi-step task completion.

This is a hands-on research role. You are expected to independently drive experiments, propose novel ideas, and iterate quickly. We value self-starters with deep intellectual curiosity and the drive to push boundaries in LLM post-training and agent capabilities.

Location: Onsite in Mountain View, CA office

What You’ll Work On
  • Collaborate with your full-time mentor to identify high-impact research directions for applying LLMs to News Break’s products
  • Independently run end-to-end SFT experiments on LLM-based agents, and assist with RL-related exploration such as reward design and training iteration
  • Curate and build high-quality training datasets: instruction-following, preference pairs, agent trajectories, and synthetic data
  • Contribute to publications; we encourage and support top-venue submissions during your internship
Requirements
  • Highly motivated and committed: willing to put in extra hours when needed to push projects across the finish line
  • Genuine passion for research: you read papers for fun, tinker with models on weekends, and care deeply about advancing the field
  • Independently capable of end-to-end model SFT, with a basic understanding of RL-based post-training methods (RLHF, DPO, PPO, GRPO, etc.)
  • Excellent taste in model behavior: able to reason about what "good" looks like across user-facing domains and articulate why
  • Strong Python and PyTorch skills
Preferred Qualifications
  • Publication at a top-tier venue (NeurIPS, ICML, ICLR, ACL, EMNLP, or equivalent)
  • Experience with multi-node distributed training (FSDP, DeepSpeed, Megatron-LM)
  • Proficiency in writing custom GPU kernels with Triton or CUDA
  • Experience building synthetic data pipelines for agent training
  • Familiarity with open-source RL frameworks: TRL, OpenRLHF, veRL/vLLM

Hourly Pay: $35 - $50
