
Senior / Staff ML Engineer, Apple Ray, Apple Data Platform

Job in Cupertino, Santa Clara County, California, 95014, USA
Listing for: Apple Inc.
Full Time position
Listed on 2026-02-22
Job specializations:
  • Software Development
    Machine Learning / ML Engineer, AI Engineer
Salary/Wage Range or Industry Benchmark: 181,100 - 318,400 USD yearly
Job Description & How to Apply Below
Position: Senior / Staff ML Engineer, Apple Ray, Apple Data Platform

Cupertino, California, United States
Software and Services

The Apple Ray team is seeking a Senior / Staff Software Engineer with strong distributed systems expertise and a solid background in machine learning. In this hybrid role, you will design and build core components of Apple’s unified data+ML platform powered by open-source Ray, while also partnering with ML teams to ensure the platform meets the needs of large-scale training and inference workloads.

You will contribute to the distributed runtime, orchestration layer, and system APIs that power Apple’s intelligent features across products and services. This role is ideal for a software engineer who enjoys low-level systems work but is also fluent in ML workflows and models at scale.

Description

Apple Ray integrates deeply with Apple’s data and ML ecosystem to provide a unified platform for building, orchestrating, and scaling complex ML and data pipelines. As a Software Engineer with an ML background, you will design distributed systems that support large-scale model training, tuning, and inference across heterogeneous compute environments, from bare-metal GPU clusters to cloud-native infrastructure. You will build features that enhance developer productivity for ML engineers, improve resource efficiency, and advance the performance and reliability of Apple’s ML workloads.

You’ll collaborate closely with ML practitioners to translate model and pipeline needs into robust platform capabilities, while also improving the underlying distributed runtime and control plane. This role requires strong engineering fundamentals, hands‑on experience with ML systems, and a passion for building scalable infrastructure.

Responsibilities
  • Build scalable distributed systems and platform components using Ray that power Apple’s data+ML workflows.
  • Develop APIs, libraries, and services that improve the efficiency and usability of large-scale ML training and inference pipelines.
  • Optimize performance and resource utilization across GPU/CPU clusters for ML workloads running at Apple scale.
  • Collaborate with ML teams to understand model and pipeline needs and translate them into robust platform features.
  • Design fault‑tolerant orchestration mechanisms, autoscaling strategies, and runtime improvements for distributed ML jobs.
  • Diagnose complex issues across distributed systems and ML pipelines to ensure reliability and availability.
  • Improve observability, monitoring, and debugging capabilities targeted at ML‑centric distributed workloads.
  • Contribute to architectural decisions and, where appropriate, upstream enhancements to Ray and related tools.
Minimum Qualifications
  • 6+ years building distributed systems, high‑scale backend services, or compute runtimes.
  • Solid background in ML workflows, model training, model serving, or data pipeline development.
  • Proficiency in Python, plus strong experience in a systems‑level language (C++, Rust, Go, or Java).
  • Experience with ML frameworks such as PyTorch or TensorFlow and familiarity with GPU‑based training.
  • Understanding of parallelism strategies, model scaling, or distributed training concepts.
  • Experience with cluster orchestration (Kubernetes, EKS, GKE) or large‑scale compute systems.
  • Strong debugging skills across distributed and ML‑centric runtime environments.
  • Ability to work cross‑functionally with ML engineers, data engineers, and infrastructure teams.
  • B.S., M.S., or Ph.D. in Computer Science, Machine Learning, or related technical fields — or comparable software engineering experience.
Preferred Qualifications
  • Experience with distributed training frameworks (DeepSpeed, Horovod, FSDP, ZeRO).
  • Background in optimizing GPU workloads or performance benchmarking.
  • Experience with model orchestration systems or ML platforms.
  • Contributions to open‑source ML or distributed systems projects.
  • Familiarity with large‑scale data systems such as Spark, Flink, or similar.

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop within a role. The base pay range for this role is between $181,100 and $318,400, and your base pay will depend on your skills,…

Position Requirements
10+ Years work experience