Principal, Data Scientist | Conversational AI

Job in Hayward, Alameda County, California, 94557, USA
Listing for: Walmart
Full Time position
Listed on 2026-01-27
Job specializations:
  • IT/Tech
    AI Engineer, Data Scientist, Machine Learning/ML Engineer, Data Analyst
Position: (USA) Principal, Data Scientist | Conversational AI
Overview
Walmart’s Next Gen Commerce team is building the future of conversational shopping with intelligent agents that reason, recommend, and proactively assist customers. As a Principal Data Scientist for Quality & LLM Judging Systems, you will serve as the technical lead for defining and measuring the success of these AI systems. You will be responsible for designing the "brain" that critiques our agents, utilizing a mix of LLM-as-a-judge frameworks, human benchmarks, and automated pipelines.

In this high-impact, hands-on role, you will partner closely with engineering and product leaders to translate subjective quality goals into rigorous, actionable metrics that drive model improvement and safe deployment.
Responsibilities
  • Develop Evaluation Architectures: Design and implement state-of-the-art evaluation pipelines for conversational agents using LLM-as-a-judge and hybrid scoring frameworks.
  • Prompt Engineering & Calibration: Develop high-precision prompts for evaluator models and rigorously test them against human judgment to ensure high inter-rater reliability.
  • Model Distillation & Optimization: Lead the fine-tuning of smaller, cost-effective models to act as scalable "Judge" models, balancing trade-offs between accuracy, latency, and cost.
  • Dataset Curation: Work with large-scale conversation logs to curate "Golden Set" datasets and design annotation instructions that standardize ground truth for subjective tasks.
  • Cross-Functional Integration: Collaborate with Engineering teams to integrate quality signals into CI/CD pipelines, enabling automated regression testing and production monitoring.
  • Failure Mode Analysis: Conduct deep-dive analyses on agent failures (hallucinations, tool misuse, safety violations) and define actionable feedback loops for the modeling team.
  • Insight Discovery & Strategic Influence: Leverage evaluation data to discover systemic weaknesses and root causes, actively influencing sub-agent modeling teams and cross-functional partners to prioritize and drive targeted improvements in overall performance.
  • Thought Leadership: Mentor senior data scientists, standardize best practices for evaluation across the org, and maintain world-class credentials through patents, publications, or conference presentations.
Minimum Qualifications
  • Education: Advanced degree (Master's or PhD) in Computer Science, Statistics, Mathematics, Computational Linguistics, or a related field.
  • Experience: 7+ years of experience in Data Science or Machine Learning with a focus on NLP, Deep Learning, or AI evaluation.
  • Generative AI Expertise: Deep understanding of Large Language Models (LLMs), including prompt engineering, chain-of-thought reasoning, and instruction tuning.
  • Technical Proficiency: Solid understanding of Python and expertise with core data science packages (NumPy, Pandas, PyTorch, Scikit-learn).
  • Metric Design: Proven experience designing metrics for non-deterministic outputs (e.g., evaluating summarization, relevance, or helpfulness).
  • Engineering Fundamentals: Experience building scalable data pipelines and familiarity with distributed training/inference frameworks.
Preferred Qualifications
  • PhD in Machine Learning, NLP, or a related quantitative field.
  • Experience with conversational AI, chatbots, summarization, retrieval-augmented generation, or recommendation evaluation in an e-commerce context.
  • Knowledge of model distillation, LoRA, instruction tuning, or parameter-efficient adaptation techniques.
  • Familiarity with evaluating open-ended outputs where ground truth is subjective or contextual.
  • Publications, patents, or open-source contributions in LLM evaluation or applied AI.
At Walmart, we offer competitive pay as well as performance-based bonus awards and other great benefits for a happier mind, body, and wallet. Health benefits include medical, vision and dental coverage. Financial benefits include 401(k), stock purchase and company-paid life insurance. Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty, and voting.

Other benefits include short-term and long-term disability, company discounts,…