Member of Technical Staff - Research Engineer, Frontier AI Robotics
Listed on 2026-06-12
Software Development
AI Engineer (Applied/Software), Software Engineer, Machine Learning/ ML Engineer, Robotics
About the role
At Frontier AI & Robotics, we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through frontier foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real‑world scenarios.
What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's computational infrastructure and rich real‑world datasets to train and deploy state‑of‑the‑art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real‑world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations.
Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world‑class researchers, and seeing your innovations deployed at unprecedented scale.
Key Job Responsibilities
- Design, implement, and optimize distributed training systems that scale across thousands of GPUs and nodes for large‑scale training workloads.
- Develop high‑performance optimizations to maximize throughput and efficiency.
- Develop reusable frameworks and libraries to improve training reproducibility, reliability, and scalability for new model architectures.
- Establish standards for reliability, maintainability, and security, ensuring systems are robust under rapid iteration.
- Collaborate with researchers to influence model architectures for optimal hardware utilization.
- Develop comprehensive benchmarking frameworks to measure and optimize model performance.
- Optimize transformer blocks using custom CUDA kernels and TensorRT optimization techniques.
- Partner with scientists to analyze model architectures and propose efficiency improvements.
- Implement and benchmark various optimization strategies for large‑scale models.
- Debug performance bottlenecks using NVIDIA profiling tools.
- Participate in technical discussions about new model architectures with the science team.
- Manage pre‑ and post‑training runs and continuously improve system stability and throughput.
- Prototype new acceleration approaches using emerging compilation frameworks.
- 5+ years of non‑internship professional software development experience, or a Bachelor's degree in computer science or equivalent.
- 5+ years of programming with at least one software programming language.
- 5+ years of leading design or architecture (design patterns, reliability, and scaling) of new and existing systems.
- Experience as a mentor, tech lead, or leading an engineering team.
- Expertise in Python, PyTorch, and CUDA programming.
- Experience with TensorRT or similar ML optimization frameworks.
- Ability to optimize ML models for production.
- Experience working directly with research teams.
- Experience working on distributed training for foundation models to make them stable, reliable, and performant.
- 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience.
- Track record of improving research productivity through infrastructure design or process improvements.
- Experience with ML infrastructure such as PyTorch, Megatron, TorchTitan, etc.
- Experience with ML compilers (ONNX Runtime, TVM, etc.).
- Experience with transformer model optimization or RL framework development.
- Background in performance profiling and optimization.
- Track record of building robust monitoring systems.
- Experience with large‑scale ML serving systems.
Amazon is an equal‑opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Los Angeles County applicants:
Job duties for this position include: work safely and cooperatively with other employees, supervisors, and staff; adhere to standards of excellence…