
Senior AI Observability Engineer - Cloud-Native & ML Pipelines

Job in Cupertino, Santa Clara County, California, 95014, USA
Listing for: Apple Inc.
Full Time position
Listed on 2026-05-27
Job specializations:
  • Software Development
    AI Engineer, Machine Learning/ ML Engineer, Cloud Engineer - Software, Software Engineer
Salary/Wage Range or Industry Benchmark: 212,000 - 318,400 USD yearly
Job Description & How to Apply Below

Senior Software Engineer - AI Observability - AI, Search & Knowledge Platform

Cupertino, California, United States Software and Services

Do you want to build the future of AI-enabled observability at Apple? We're looking for an experienced AI observability engineer to design and build observability solutions for Apple Intelligence, Search, and the AI infrastructure behind Apple's intelligent products. We're at the forefront of building AI-first observability services, blending AI, cloud-first engineering, and industry standards to deliver smart, scalable solutions.

Your work will directly impact the experience of billions of users on their favorite Apple devices. If you are a seasoned principal or senior software engineer with a proven track record in building AI-enabled observability solutions and a deep passion for observability, AI, cloud-native technologies, and large-scale distributed systems, we want to talk with you.

Description

The AI, Search & Knowledge Platform Cloud Infrastructure Team within Apple’s Services organization designs, builds, and scales the foundational systems that power Search and next-generation machine learning workloads. We're pioneering the next generation of AI-powered observability solutions. While we innovate to build new solutions, we also leverage industry-standard open-source technologies. In this role, you will collaborate with a team of engineers to lead the design and development of user-facing observability features for AI/ML products and infrastructure.

You will also be responsible for providing technical guidance, sharing observability best practices and know-how, leveraging AI pipelines, and mentoring the team to develop and deliver best-in-class features and a delightful user experience for all users.

Minimum Qualifications
  • 7+ years of software engineering experience building and operating large-scale, cloud-native, distributed systems and microservices in public cloud infrastructure and/or "private cloud" environments
  • 7+ years of software engineering experience and a strong background in computer science: distributed systems, algorithms and data structures, APIs, and highly scalable, reliable systems and microservices
  • Demonstrated experience using LLM and ML models for AIOps and model observability
  • Hands-on experience building ML pipelines and portable workflows, and tuning models, to deploy ML and LLM models in production for customer-facing features
  • Hands-on experience using LLMs, ML frameworks (e.g. TensorFlow, PyTorch), and libraries like scikit-learn, NumPy, LangChain, MLflow, and Kubeflow
  • Experience building services for Observability Analysis, including anomaly detection, incident detection, automated remediation, and root-cause analysis
  • Excellent verbal and written communication, problem solving, and cross-team collaboration skills, including with open source communities
Preferred Qualifications
  • Knowledge of current Gen AI research and techniques: MCPs, RAG systems, Agentic AI (multi-agent orchestration, tool calling)
  • Hands‑on experience with agentic AI frameworks (e.g. LangGraph, AutoGen, CrewAI) for building multi‑step reasoning and tool‑using agents
  • Experience designing multi-agent orchestration, tool‑calling, or RAG systems for operational/diagnostic workflows
  • Demonstrated proficiency operating workloads on public and/or private cloud platforms, Kubernetes, object storage, networking, databases, and observability services
  • Demonstrated experience in building observability systems for metrics, distributed tracing, logs, profiling
  • Experience with large-scale observability visualization tools like Grafana, Datadog, and ELK
  • Experience building large-scale incident management, alert management, and notification systems
  • Active contributions to CNCF or open source projects (e.g., k8sGPT, HolmesGPT, kagent, OpenTelemetry, Prometheus)

Base pay for this role ranges from $212,000 to $318,400, depending on skills, qualifications, experience, and location. Apple employees also have the opportunity to become shareholders through discretionary employee stock programs and may purchase Apple stock at a discount through the Employee Stock Purchase Plan. Benefits include…

Position Requirements
10+ Years work experience