CI Systems Engineer - AI Failure Analysis
Listed on 2026-06-04
Software Development
AI Engineer, Software Engineer, DevOps
Cupertino, California, United States
Software and Services
Apple's Software Developer Workflows team delivers fast, reliable CI systems that make Apple's software easier to develop and ship. We believe that streamlined development unlocks creativity, innovation, and potential for developers. We're looking for a skilled CI Systems Engineer to join our team and help build intelligent AI‑assisted systems that enable Apple's OS engineers to understand, diagnose, and resolve build and test failures. We value diverse perspectives and unique skills.
More than specific experience, we seek an engineer passionate about building great software, applying AI thoughtfully to real problems, continuous learning, and solving complex technical challenges.
In this role, you will design and maintain infrastructure for collecting, processing, and analyzing massive volumes of CI results data – and you will integrate AI capabilities that transform how developers interact with failure information. Your work will turn complex failure signals into actionable insights that accelerate development, using AI to summarize failures, separate genuine issues from distractions, and help engineers focus on what needs attention.
Success requires flexibility, proactivity, and thriving in a supportive environment with challenging problems. You'll need excellent judgment for timely technical decisions, the ability to collaborate effectively in design discussions, and strong technical depth to make informed tradeoffs about when and how AI can genuinely improve developer workflows. As a CI Systems Engineer, you will work at the intersection of AI and developer tools, shaping how thousands of engineers across Apple diagnose failures.
You'll have the autonomy to evaluate new AI approaches, influence infrastructure architecture, and see your work directly reduce friction in Apple's software development process.
- Develop AI‑assisted failure analysis systems that transform raw CI data into actionable insights, helping developers quickly diagnose root causes and understand test failure patterns
- Design and implement AI‑powered triage workflows that intelligently summarize failures, identify patterns across large result sets, and distinguish signal from noise
- Build and integrate tools that give AI systems structured access to CI data, enabling intelligent querying and analysis
- Optimize data structures and database design for fast storage and deduplication of build and test failures, ensuring AI systems have efficient access to the context they need
- Drive performance improvements and optimization initiatives for results storage and query latency to meet developer needs
- Collaborate with OS engineering teams across platforms (iOS, macOS, etc.) to understand diagnostic needs and refine AI‑assisted analysis capabilities
- Implement observability and alerting for the CI results infrastructure itself, ensuring reliability and detecting systemic failure patterns
- Evaluate and iterate on AI approaches, measuring their effectiveness at reducing triage time and improving developer experience
- Share technical knowledge and best practices with team members on failure analysis, AI integration patterns, data systems design, and infrastructure challenges
- BS in Computer Science or equivalent professional experience
- 8+ years of software engineering experience, preferably 2+ years focused on CI infrastructure, data systems, or failure analysis
- Experience applying AI/ML or LLM‑based approaches to software development workflows, tooling, or automation
- Proficiency in one or more languages suited to systems and data work (Swift, Scala, Python, Go, C/C++, etc.)
- Proven ability to work independently on complex problems and collaborate effectively on team initiatives
- Strong communication skills to collaborate with diverse teams and translate complex failure data into developer‑friendly insights
- Demonstrated experience in designing or contributing to systems that handle scale, data integrity, and query performance
- Experience building or integrating with AI agents using the latest‑available tools such as…