Sr Software Engineer, Machine Learning Platform Technologies – Cloud Infrastructure

Job in Cupertino, Santa Clara County, California, 95014, USA
Listing for: Apple Inc.
Full Time position
Listed on 2026-01-06
Job specializations:
  • Software Development
    AI Engineer, Cloud Engineer - Software

AIML - Sr Software Engineer, Machine Learning Platform Technologies

Cupertino, California, United States
Machine Learning and AI

Are you an open-source contributor passionate about building the next generation of cloud‑native ML infrastructure? We're seeking a hands‑on technical leader with deep expertise in Kubernetes, Crossplane, Golang/Rust, and agentic workflows to design and scale the platforms that power Apple's Siri, Search, and AI/ML ecosystems. If you've contributed to CNCF projects such as Crossplane, ArgoCD, or Kubernetes, and you're driven to build infrastructure for ML training and inference—including optimizing for performance, cost, and automation—this role is for you.

You'll architect at Apple scale, developing intelligent, declarative, and self‑managing infrastructure that enables billions of seamless user experiences.

Description

Our MLPT Cloud Infrastructure Team within Apple's AI/ML organization designs, builds, and scales the foundational systems that power Siri, Search, and next‑generation ML workloads. We're reimagining how infrastructure is managed—through agentic, event‑driven workflows, Crossplane compositions, and self‑healing control planes—to deliver Model Context Protocol (MCP)‑based infrastructure servers that integrate seamlessly with ML and data workflows. You'll work closely with AI/ML engineers, SREs, and platform teams to deliver infrastructure that is automated, observable, and efficient across Apple‑scale hybrid and multi‑cloud environments.

Responsibilities
  • Architect and develop cloud‑native, agentic infrastructure platforms supporting ML training, inference, and large‑scale distributed systems.
  • Lead and mentor engineers building Crossplane‑based control planes, Kubernetes operators, and ArgoCD‑driven GitOps automation.
  • Design, build, and optimize Model Context Protocol (MCP) servers that manage and contextualize infrastructure and application state across environments.
  • Contribute to and upstream improvements in open‑source CNCF projects, representing Apple in the cloud‑native community.
  • Implement observability, governance, and automation frameworks to ensure performance, reliability, and compliance.
  • Collaborate with AI/ML and infrastructure teams to integrate agentic orchestration workflows for self‑service provisioning, ML pipeline management, and dynamic scaling.
  • Drive best practices for GitOps, IaC, and Kubernetes cluster lifecycle automation at global scale.
  • Ensure systems are resilient, secure, and optimized for cost and performance across on‑prem and multi‑cloud environments.
Minimum Qualifications
  • BS/MS in Computer Science or related field (or equivalent practical experience).
  • 5+ years of experience in distributed systems or cloud infrastructure engineering.
  • Strong programming experience in Golang and/or Rust; expertise in building controllers, operators, or automation systems.
  • Deep understanding of Kubernetes internals, controller‑runtime, and Crossplane composition frameworks.
  • Experience with ArgoCD, Helm, and Infrastructure‑as‑Code (Terraform, Pulumi, or Crossplane).
  • Hands‑on experience with GitOps, declarative configuration, and reconciliation‑driven workflows.
  • Proven ability to design and operate infrastructure for ML training and inference, including performance tuning and GPU optimization.
  • Experience leading technical teams, driving architecture decisions, and mentoring engineers.
  • Strong grounding in cloud cost efficiency, performance profiling, and system‑level debugging.
Preferred Qualifications
  • 9+ years in cloud infrastructure, SRE, or distributed systems roles.
  • Active contributor to CNCF open‑source projects (e.g., Kubernetes, Crossplane, ArgoCD, Envoy, Prometheus).
  • Deep expertise in Kubernetes API machinery, custom resources (CRDs), and control plane development.
  • Experience with Model Context Protocol (MCP)–based systems or contextual orchestration servers.
  • Familiarity with AIOps or agentic AI workflows in production environments.
  • Strong understanding of observability, telemetry, and distributed tracing (OpenTelemetry, Prometheus, Grafana).
  • Excellent communication, technical writing, and cross‑functional leadership skills.

At Apple, base pay is one part of…
