
Engineering Manager - ML Platform and Infrastructure

Remote / Online - Candidates ideally in
Sunnyvale, Santa Clara County, California, 94087, USA
Listing for: Applied Intuition Inc.
Remote/Work from Home position
Listed on 2026-02-21
Job specializations:
  • IT/Tech
    Systems Engineer, AI Engineer, Data Engineer, Machine Learning/ML Engineer
  • Engineering
    Systems Engineer, AI Engineer, Data Engineer
Salary/Wage Range or Industry Benchmark: 204,000 - 343,000 USD yearly
Job Description & How to Apply Below

Sunnyvale, California, United States

About Applied Intuition

Applied Intuition, Inc. is powering the future of physical AI. Founded in 2017 and now valued at $15 billion, the Silicon Valley company is creating the digital infrastructure needed to bring intelligence to every moving machine on the planet. Applied Intuition services the automotive, defense, trucking, construction, mining and agriculture industries in three core areas: tools and infrastructure, operating systems, and autonomy.

Eighteen of the top 20 global automakers, as well as the United States military and its allies, trust the company’s solutions to deliver physical intelligence. Applied Intuition is headquartered in Sunnyvale, California, with offices in Washington, D.C.; San Diego; Ft. Walton Beach, Florida; Ann Arbor, Michigan; London; Stuttgart; Munich; Stockholm; Bangalore; Seoul; and Tokyo.

We are an in‑office company; employees are expected to primarily work from the Applied Intuition office five days a week. We also recognize the importance of flexibility and trust our employees to manage their schedules responsibly, including occasional remote work.

About the role

As an Engineering Manager on the ML Platform team, you’ll lead a world‑class group of engineers focused on building the infrastructure that powers Physical AI. Your team will own three critical areas: Training & Inference Orchestration, GPU Cluster Architecture, and Performance Optimization. You’ll work at the intersection of systems engineering and ML, partnering directly with stack development and research teams to remove bottlenecks and accelerate the path from experimentation to production.

At Applied Intuition, you will:
  • Grow and manage a team of world‑class infrastructure and systems engineers with the goal of delivering a best‑in‑class ML platform for Physical AI
  • Own the design and evolution of frameworks for orchestrating distributed training and inference jobs across thousands of GPUs
  • Drive the build‑out and scaling of our GPU cluster infrastructure, making critical decisions on architecture, scheduling, networking, and resource management
  • Lead efforts to optimize training and inference performance—including throughput, fault tolerance, GPU utilization, and cost efficiency at scale
  • Set team goals and roadmap in alignment with research milestones, model development timelines, and production deployment requirements
  • Partner closely with research, stack development, and infrastructure teams to understand their workflows and accelerate their iteration speed
  • Drive hiring, mentoring, and growth for a high‑performing, mission‑driven team
We’re looking for someone who has:
  • 3+ years of engineering management experience, ideally leading infrastructure or platform teams
  • Passion for building and leading high‑performing teams that operate at the frontier of scale
  • Deep experience with distributed systems, GPU computing, or large‑scale ML infrastructure
  • Direct experience building or operating large GPU clusters (1,000+ GPUs)
  • Strong understanding of distributed training frameworks (e.g., PyTorch Distributed, Megatron‑LM, DeepSpeed, FSDP) and job orchestration at scale
  • Familiarity with GPU cluster management, high‑performance networking (InfiniBand, RDMA), and resource scheduling (Slurm, Kubernetes)
  • Track record of building and operating systems that run reliably at massive scale
Nice to have:
  • Background in training optimization techniques such as mixed‑precision training, pipeline/tensor/data parallelism, or checkpointing strategies
  • Experience with inference optimization (batching, model serving, quantization, compiler‑level optimizations)
  • Familiarity with Physical AI domains such as autonomous driving, robotics, or simulation
  • Contributions to open‑source ML infrastructure projects
Compensation

Base salary ranges for this full‑time position in Sunnyvale, California, are $204,000 – $343,000 USD annually. Compensation also includes equity, comprehensive health, dental, vision, life and disability insurance, 401(k) retirement benefits with employer match, learning and wellness stipends, and paid time off.

Equal Opportunity Employer

Applied Intuition is an equal‑opportunity employer and federal contractor or subcontractor. We comply with all applicable laws and regulations prohibiting discrimination and ensuring affirmative action. All qualified individuals, including protected veterans and individuals with disabilities, are encouraged to apply. We are committed to diversity, equity, and inclusion in all aspects of our workforce.
