
Senior Manager of Meta Evaluation & Quality Assurance

Job in Seattle, King County, Washington, 98127, USA
Listing for: Apple Inc.
Full Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    Data Scientist, Machine Learning/ML Engineer
Salary/Wage Range or Industry Benchmark: 60,000 - 80,000 USD yearly
Job Description & How to Apply Below

Seattle, Washington, United States
Software and Services

Apple Services Engineering (ASE) powers many AI and LLM features across App Store, Music, Video and more. As these systems increasingly rely on human-in-the-loop evaluation systems, the quality of our decisions is constrained by the quality of our evaluation systems. We believe that to build exceptional LLMs, you need exceptional mechanisms to validate the signals used to train and evaluate them.

Description

As the Senior Manager of Meta Evaluation & Quality Assurance, you will lead a specialized team of Data Scientists and Machine Learning Engineers who evaluate the evaluators. You will move beyond basic validation to lead the strategy and technical development of ML-based validation frameworks and automated data quality validation pipelines. You will set strategy, guide execution, and work cross-functionally to deliver a cohesive quality system that combines machine learning with human-in-the-loop processes to ensure our metrics are trustworthy, robust, and decision-ready.

Responsibilities
  • Lead, mentor, and develop a multidisciplinary team of Data Scientists and Machine Learning Engineers, fostering a culture of rigorous scientific inquiry, technical excellence, and accountability.
  • Define and drive the strategic roadmap for Meta Evaluation methodology and standards across Apple Services.
  • Oversee the development of ML-based quality validation systems. You will guide ML Engineers in building models that utilize human-in-the-loop workflows to audit evaluators, identifying anomalies, disagreement, and ambiguity in evaluation data.
  • Establish data quality validation standards and define the statistical processes for measuring confidence, calibration, and inter-rater reliability.
  • Partner with Model Engineering and Data Science teams to validate new AI Judges (autograders) and Agents pre-production, ensuring they meet prescribed performance standards before deployment.
  • Collaborate with Operations teams to build active learning loops where human experts adjudicate discrepancies flagged by your validation models, creating a continuous cycle of system improvement.
  • Monitor the health of the evaluation ecosystem, identifying risks such as evaluator drift, bias, or silent agent failures, and reporting decision-readiness signals to leadership.
  • Stay current with industry best practices in evaluation science, active learning, and hybrid human-AI quality control, bringing innovative validation methods to Apple’s evaluation stack.
  • This is a highly collaborative leadership position that requires working across Engineering, Quality, Training, and Production Ops. Above all, you must be able to manage and lead change effectively while maintaining Apple culture and standards. Interpersonal skills, strategic thinking, and technical product knowledge are essential for success in this role.
Minimum Qualifications
  • 8+ years of experience in Data Science, Machine Learning, or Evaluation Science, with 3+ years leading technical teams.
  • Strong background in Meta Evaluation, AI/ML measurement, statistics, or quality assurance methodologies.
  • Demonstrated success in designing Human-in-the-Loop (HITL) machine learning systems or active learning pipelines.
  • Master's degree in Statistics, Data Science, Machine Learning, or related field.
Preferred Qualifications
  • PhD in Statistics, Computer Science, Machine Learning, or related field.
  • Deep understanding of evaluation pipelines, calibration techniques, and statistical process control.
  • Experience building ML models specifically designed for quality estimation, anomaly detection, or disagreement modeling.
  • Proficiency in Python or R for statistical analysis and reasoning about evaluation data.
  • Experience defining governance gates or certification processes for AI systems.
  • Proven ability to manage complex methodological and technical programs in dynamic, fast-paced environments.
  • Exceptional communication, organizational, and analytical skills.

At Apple, base pay is one part of our total compensation package and is determined within a range. This provides the opportunity to progress as you grow and develop…

Position Requirements
10+ years of work experience