AI Experience Researcher, Product Evaluation, Vision Products Group

Job in Seattle, King County, Washington, 98127, USA
Listing for: Apple Inc.
Full Time position
Listed on 2026-06-04
Job specializations:
  • IT/Tech
    AI Engineer, Data Scientist
Salary/Wage Range or Industry Benchmark: 134,800 - 245,800 USD yearly
Job Description & How to Apply Below
Position: AI Experience Researcher, Product Evaluation, Vision Products Group

Seattle, Washington, United States
Machine Learning and AI

We are seeking a highly motivated and analytical AI Experience Researcher to join our team. This role blends cognitive and human sciences, data sciences, systems design, and product evaluation to ensure AI-powered products deliver exceptional and intuitive customer experiences. You will work alongside a small but impactful team, collaborating with ML and data scientists, software engineers, designers, project managers, and other cross-functional teams at Apple to define success criteria for AI experiences, and create rigorous evaluations that measure these criteria in iterative product development cycles.

If you're passionate about applying scientific rigor to real-world problems, thrive on innovation, and want your work to impact hundreds of millions of users, this role offers an exceptional opportunity to make a lasting contribution to products people use every day.

Description

The central challenge of this role is figuring out what "good" means for an AI experience, and then designing rigorous evaluations that measure those qualities reliably and at scale. This requires both deep theoretical grounding in human experience and a solid analytical mindset to operationalize that understanding into scalable evaluation frameworks. Leaning on research in human sciences, you will decompose complex AI interactions into their constituent parts, reason about how those parts interact, and build evaluation frameworks that hold up under the scrutiny of the non-deterministic nature of AI experiences and the pressures of iterative product development.

You will derive experimental designs, create golden data sets, write tests, and turn them into prompts for LLM judges or instructions for human raters. You will run automated evaluations, analyze results, and present findings to diverse stakeholders. Candidates who bring both quantitative rigor and a qualitative sensibility — to recognize patterns in model behaviors and outputs, and to develop an interpretive understanding of what the data is and isn't capturing from a human perspective — will thrive in this role.

What matters most is the ability to hold both orientations at once — to think carefully about what makes an experience work, and to measure complex human dimensions with precision. We are also looking for someone who is excited to co‑create what this discipline looks like going forward — bringing intellectual curiosity and a point of view about where human-centered AI evaluation should be headed.

Responsibilities
  • Develop scalable automated evaluation methodologies by operationalizing complex multi-modal, multi-turn AI experiences into observable and measurable metrics that work across diverse use cases, features, or product areas
  • Produce comprehensive evaluation plans detailing evaluation scope, validation and data strategy, tooling requirements, resource allocation, and timelines
  • Derive experimental designs and write test instructions for LLM judges or for human raters
  • Define requirements for, or curate, datasets that represent realistic usage; support data generation and annotation workflows to ensure coverage, quality, and alignment with product goals
  • Implement and analyze automated evaluations, maintaining rigor around reproducibility and identifying key insights and areas for improvement across both qualitative and quantitative patterns
  • Prepare and present clear, concise, and impactful evaluation findings to diverse stakeholders, translating results into actionable recommendations for model training, ranking, and product decisions
  • Partner with engineers, QA, data scientists, designers, and product managers throughout the product development lifecycle to integrate evaluation insights and drive continuous improvement
  • Contribute to evolving human-centered AI evaluation methodologies and help to define best practices for AI experience evaluation as the field matures

Minimum Qualifications
  • Advanced degree in Cognitive Psychology, Human‑Computer Interaction (HCI), User Experience (UX) Research, Learning Sciences, Learning Analytics, Psychometrics,…