Research Scientist, Efficient ML Systems
Job in Sunnyvale, Santa Clara County, California, 94087, USA
Listed on 2026-06-13
Listing for: Goaly AI
Full Time position
Job specializations:
- IT/Tech: AI Engineer (Applied/Software), Machine Learning/ML Engineer, Data Scientist
Role Description
As an AI Research Scientist (Efficient ML Systems) at Goaly, you will research and build the systems that make frontier-scale models practical. This role sits at the intersection of algorithms, systems, and hardware efficiency. You will design and evaluate new training and inference techniques, prototype them in real systems, and push them to production-scale workloads.
Your work will either ship directly into our core platform or lead to publications at top venues such as NeurIPS, ICML, ICLR, or CVPR. This is not a paper-only role. You will write real systems code, run large-scale experiments, and directly shape how modern LLMs and RL systems are trained and deployed.
Core Responsibilities
- Research efficient AI/ML systems: Invent and evaluate algorithms and system techniques that improve LLM and agentic RL training and inference efficiency (memory, compute, communication, and stability).
- Scale agentic RL: Design and optimize large-scale agentic RL pipelines, including asynchronous training, experience management, reward modeling, and long-horizon stability.
- End-to-end experimentation: Design large-scale experiments spanning model architecture, training algorithms, distributed systems, and hardware-aware optimization.
- System-aware research: Prototype research ideas directly in training and inference stacks (e.g., parallelism strategies, attention kernels, RL training pipelines) and validate them at scale.
- Production & publication: Translate successful ideas into production-ready systems and/or publish them at top-tier conferences with full internal support.
Qualifications
- Ph.D. or Master's degree in CS, AI, Systems, or related fields (Exceptional undergraduates with strong research capabilities may be considered).
- Strong foundation in LLM or large-scale ML training, including Transformers, attention mechanisms, distributed training, and optimization methods.
- Experience or strong interest in agentic RL or large-scale reinforcement learning systems, including stability, scalability, or long-horizon training challenges.
- Demonstrated interest in efficiency-focused research, such as training acceleration, memory optimization, parallelism, kernels, or RL system robustness.
- Proficient in PyTorch or JAX. Clean coding style and strong command of Python.
- Adaptability: A fast learner with a strong sense of responsibility, capable of wearing multiple hats and handling cross-stack challenges.
What We Offer
- Expert Mentorship: Partner with AI veterans who have trained trillion-parameter models at scale and applied them to solve real-world problems in billion-user products.
- Compute Freedom: Access to abundant GPU cluster resources, so your creativity is not limited by compute.
- Flat Culture: Flat management structure that rejects office politics and values only technology and results.
- Competitive Compensation: Competitive full-time offers, huge upside, and extra equity incentives when hitting key milestones.