Research Scientist, Efficient ML Systems Job, Sunnyvale area, California, USA, IT/Tech

Role Description

As an AI Research Scientist (Efficient ML Systems) at Goaly, you will research and build the systems that make frontier-scale models practical. This role sits at the intersection of algorithms, systems, and hardware efficiency. You will design and evaluate new training and inference techniques, prototype them in real systems, and push them to production-scale workloads.

Your work will either ship directly into our core platform or lead to publications at top venues such as NeurIPS, ICML, ICLR, or CVPR. This is not a paper-only role. You will write real systems code, run large-scale experiments, and directly shape how modern LLMs and RL systems are trained and deployed.

Core Responsibilities

Research efficient AI/ML systems: Invent and evaluate algorithms and system techniques that improve LLM and agentic RL training and inference efficiency (memory, compute, communication, and stability).
Scale agentic RL: Design and optimize large-scale agentic RL pipelines, including asynchronous training, experience management, reward modeling, and long-horizon stability.
End-to-end experimentation: Design large-scale experiments spanning model architecture, training algorithms, distributed systems, and hardware-aware optimization.
System-aware research: Prototype research ideas directly in training and inference stacks (e.g., parallelism strategies, attention kernels, RL training pipelines) and validate them at scale.
Production & publication: Translate successful ideas into production-ready systems and/or publish them at top-tier conferences with full internal support.

Qualifications

Ph.D. or Master's degree in CS, AI, Systems, or a related field (exceptional undergraduates with strong research capabilities may be considered).
Strong foundation in LLM or large-scale ML training, including Transformers, attention mechanisms, distributed training, and optimization methods.
Experience or strong interest in agentic RL or large-scale reinforcement learning systems, including stability, scalability, or long-horizon training challenges.
Demonstrated interest in efficiency-focused research, such as training acceleration, memory optimization, parallelism, kernels, or RL system robustness.
Proficiency in PyTorch or JAX, with a clean coding style and a strong command of Python.
Adaptability: A fast learner with a strong sense of responsibility, capable of wearing multiple hats and handling cross-stack challenges.

Benefits

Expert Mentorship: Partner with AI veterans who have trained trillion-parameter models at scale and applied them to solve real-world problems in billion-user products.
Compute Freedom: Access to abundant GPU cluster resources - don't let your creativity be limited by compute.
Flat Culture: Flat management structure that rejects office politics and values only technology and results.
Competitive Compensation: Competitive full-time offers, huge upside, and extra equity incentives when hitting key milestones.
