Digital Success Associate Evaluations Manager
Listed on 2026-02-16
IT/Tech
Data Analyst, AI Engineer
Job Category
Program & Project Management
Job Details
Associate Evaluations Manager on the Digital Success Data & AI team.
About Salesforce
Salesforce is the #1 AI CRM, where humans with agents drive customer success together. Here, ambition meets action. Tech meets trust. And innovation isn't a buzzword – it's a way of life. The world of work as we know it is changing and we're looking for Trailblazers who are passionate about bettering business and the world through AI, driving innovation, and keeping Salesforce's core values at the heart of it all.
Ready to level‑up your career at the company leading workforce transformation in the agentic era? You're in the right place! Agentforce is the future of AI, and you are the future of Salesforce.
As an Associate Evaluations Manager on the Digital Success Data & AI team, you will help measure, monitor, and improve the performance of Agentforce on Salesforce Help. You'll run ongoing synthetic evaluations across Answer Quality, latency, instructions adherence, and other core capabilities, translating results into clear, actionable insights that inform product decisions, operational improvements, and leadership reporting.
You will contribute to operational excellence by building clear documentation, repeatable processes, and scalable evaluation frameworks that increase confidence in agent performance. This role partners closely with Support Delivery, Engineering, Operations, and Data Science to evaluate new features, identify risks and opportunities, and ensure quality at launch.
You will also help evolve our evaluation ecosystem – refining LLM judge prompts and ground truths – while identifying opportunities to increase automation, reliability, and efficiency. Success in this role requires strong analytical rigor, curiosity, and ownership: someone comfortable working in ambiguity, solving problems end‑to‑end, and consistently delivering clear, defensible insights that drive action.
Your Impact
- Support the Agentforce baselining program, using synthetic and automated tooling to continuously measure and improve performance.
- Analyze evaluation results independently, identifying root causes, surfacing trends, and translating insights into actionable recommendations for models, implementations, and processes.
- Maintain and evolve evaluation frameworks, scoring rubrics, and guidelines to ensure consistent, defensible, and scalable assessments.
- Deliver clear, influential reporting and business reviews that inform stakeholders and drive product and operational decisions.
- Define, monitor, and interpret key evaluation metrics, proactively identifying risks, regressions, and improvement opportunities.
- Enable internal partners on evaluation processes and findings, building trust and shared understanding across teams.
- Strengthen the evaluation feedback loop across automated testing, LLM‑judge prompts, and golden datasets to continuously improve testing sophistication.
- Perform targeted evaluations for new features and urgent initiatives, ensuring quality and market readiness.
- Audit and refine the utterance repository to keep testing relevant, high quality, and aligned with evolving product capabilities.
- Synthesize customer and internal feedback into actionable insights, helping shape product direction and operational improvements.
- Advocate for tooling, process, and workflow improvements that increase evaluation efficiency, scalability, and reliability.
- Proactively surface risks and partner on mitigations, ensuring issues are addressed before they impact customers.
- 1+ years of professional experience working in Salesforce environments (program, analyst, operations, or product context). Demonstrated ability to take ownership of tasks and drive outcomes independently.
- Strong analytical mindset: comfortable reviewing conversational AI outputs, identifying failure patterns, conducting root cause analysis, and translating findings into actionable recommendations.
- Operational rigor and attention to detail: able to execute repeatable evaluation workflows accurately and consistently in a fast‑paced, ambiguous environment.
- Clear written communication skills: able to document…