Sr. Software Engineer - AI/ML, AWS Neuron Distributed Training

Job in Cupertino, Santa Clara County, California, 95014, USA
Listing for: Amazon Web Services (AWS)
Apprenticeship/Internship position
Listed on 2026-02-03
Job specializations:
  • Software Development
    Machine Learning/ ML Engineer, AI Engineer, Data Scientist

Overview

Annapurna Labs designs silicon and software that accelerate innovation. Our custom chips, accelerators, and software stacks enable us to tackle unprecedented technical challenges and deliver solutions that help customers change the world. AWS Neuron is the complete software stack powering AWS Trainium (Trn2/Trn3), and we are seeking a Senior Software Engineer to join our ML Distributed Training team.

Responsibilities
  • Design, implement, and optimize distributed training solutions for large-scale ML models running on Trainium instances. A significant part of your work will involve extending and optimizing popular distributed training frameworks, including FSDP (Fully Sharded Data Parallel), TorchTitan, and Hugging Face libraries, for the Neuron ecosystem.
  • Develop and optimize mixed-precision and low-precision training techniques using BF16, FP8, and emerging numerical formats to maximize training throughput while maintaining model accuracy and convergence quality. Implement precision-aware training strategies, loss scaling techniques, and careful gradient management to ensure training stability across reduced precision formats.
  • Profile, analyze, and tune end-to-end training pipelines to achieve optimal performance on Trainium hardware. Partner with hardware, compiler, and runtime teams to influence system design and unlock new capabilities. Work directly with AWS solution architects and customers to deploy and optimize training workloads at scale.
Qualifications
  • Bachelor's degree in computer science or equivalent
  • 5+ years of non-internship professional software development experience
  • 5+ years of programming with at least one software programming language
  • 5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems
  • 5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
  • Experience as a mentor, tech lead or leading an engineering team
  • Experience in machine learning and large-scale training of LLMs, and expertise in PyTorch
Preferred Qualifications
  • Master's degree in computer science or equivalent
  • Experience in computer architecture
  • Previous software engineering expertise with PyTorch/JAX/TensorFlow, distributed libraries and frameworks, and end-to-end model training

Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.

Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit (Use the "Apply for this Job" box below) for more information. If the country/region you’re applying in isn’t listed, please contact your Recruiting Partner.

The base salary range for this position is listed below. Your Amazon package will include sign-on payments and restricted stock units (RSUs). Final compensation will be determined based on factors including experience, qualifications, and location. Amazon also offers comprehensive benefits including health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage), 401(k) matching, paid time off, and parental leave.

Learn more about our benefits at .

USA, CA, Cupertino -  -  USD annually

Job : A3168219
