
ML Infrastructure Service Reliability Engineer - Apple Services Engineering

Job in Cupertino, Santa Clara County, California, 95014, USA
Listing for: Apple Inc.
Per diem position
Listed on 2026-05-25
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: USD 80,000 - 100,000 per year
Job Description & How to Apply Below

At Apple, we don’t just build products — we create transformative experiences that have reshaped entire industries. Our innovation is driven by the diversity of our people and their ideas, inspiring everything we do. Imagine the impact you could make. Join Apple and help us leave the world better than we found it.

The ML Infrastructure team manages Apple’s largest ML compute platform and its multi‑cloud storage abstraction and caching platform, which together support critical machine learning training workloads that power user‑facing features across the Apple ecosystem. Operating across both first‑party and third‑party cloud environments brings complex and unique challenges.

As a Site Reliability Engineer (SRE) on the ML Infrastructure team, you’ll address these challenges by drawing on a strong foundation in cloud object storage, data analysis, automation, and collaboration, along with advanced expertise in Kubernetes. Our team oversees the full infrastructure stack, from low‑level nodes to the complete network architecture, ensuring our platform remains highly available, resilient, and efficient at scale.

Description

We are seeking an experienced Software and Systems Engineer to join our dynamic team. This role demands a proactive mindset, technical excellence, and a collaborative spirit.

The ideal candidate will demonstrate:

  • Strong critical thinking and a high degree of individual accountability
  • Effective communication and collaboration skills
  • A genuine passion for Infrastructure as a Service (IaaS)
  • A commitment to automation and operational efficiency
  • Ownership of projects from design through delivery
  • A solutions‑oriented approach, coupled with the ability to gain alignment on technical direction
  • Consistent and timely execution of design implementations aligned with project objectives
  • The ability to provide constructive technical feedback, fostering team‑wide growth and continuous improvement
Responsibilities
  • Participate in a rotating on‑call schedule, including occasional weekend coverage when necessary.
  • Leverage a diverse stack of open‑source tools, commercial solutions, and internally developed systems to deliver robust services.
  • Encourage open dialogue, value strong ideas, and recognize impactful results within the team.
  • Collaborate across teams to support global operations across time zones.
Minimum Qualifications
  • 5+ years of experience building, operating, and scaling large applications in private, public, or hybrid cloud environments.
  • Deep expertise in Kubernetes, with hands‑on experience using platforms such as EKS.
  • Proficiency in designing, developing, and releasing code in languages such as Python, Go, or Rust.
  • Practical experience with object storage technologies, including Amazon S3.
  • Strong background in designing and troubleshooting complex networking issues in both public and private cloud infrastructures.
  • Solid understanding of Linux internals, standard networking protocols, and distributed systems architecture.
Preferred Qualifications
  • Proven drive to automate manual operations and enhance processes through a strong understanding of best practices for deploying large‑scale, distributed applications.
  • Hands‑on experience managing diverse system environments using configuration management tools or software delivery platforms such as Spinnaker, Helm, or Flux.
  • Demonstrated expertise in deploying, supporting, and monitoring both new and existing services, platforms, and application stacks.
  • Solid familiarity with container orchestration and management using Kubernetes and related tooling.

At Apple, we believe accessibility is a fundamental human right. You’ll find that idea reflected in everything here — in our culture, our benefits and our digital tools. By welcoming as many perspectives as possible, we help you build a career where you feel like you belong. Learn about accessibility in Apple’s workplace.
