AI Evaluation TPM — Cross-Functional Impact
Listed on 2026-05-28
-
IT/Tech
AI Engineer, Systems Engineer
About the role:
We are seeking a Technical Program Manager to lead our AI model evaluation initiatives across multiple work streams. This role will be crucial in assessing the performance, capabilities, limitations, and potential risks of our AI models. Working closely with our Research, Trust & Safety, Frontier Red teaming, and Policy teams, you will drive high-priority evaluation projects to build new processes, align metrics with policy, and track measurable progress.
You will help build and adapt the model evaluation program to ensure model deployments are rigorous and aligned with our commitment to responsible AI development. The ideal candidate will have a strong technical background and experience managing cross-functional programs in AI development, ML engineering, or related fields.
You’ll be joining a team of Technical Program Managers who own and drive cross-functional programs that align to the company’s top priorities. In this role, you’ll have the opportunity to make a foundational impact as you contribute the scaling of a centralized TPM function for the company.
Extremely strong soft skills are paramount, as our team is front and center in driving lots of company-wide changes and top priority initiatives that require generating buy-in, balancing various opinions, and competing for attention in our rapidly scaling environment.
This role is a great fit for someone who has both seen excellence at scale and operated in rapidly scaling, high-ambiguity teams and scope. We are seeking candidates with deep TPM expertise but who are comfortable acting as adaptable generalists who add value fast.
We excel at maintaining a broad view of our work but diving deep into the details when necessary. We understand business goals, translate and organize them into technical programs and projects, and drive execution. We are adept at engaging with both non-technical and technical stakeholders at all levels of the company, including executive leadership.
In this role, you will have the opportunity to shape the development of advanced AI systems and contribute to Anthropic's mission of ensuring that AI benefits all of humanity. If you are passionate about responsible AI development, have a strong technical background, and thrive in a fast-paced, collaborative environment, we'd love to hear from you.
Responsibilities:
- Partner with teams like Frontier Risk Evaluations, Security, and Trust & Safety to develop and implement comprehensive evaluation protocols for our latest frontier AI models
- Build a single source of truth for tracking all types of model evaluations as required by our Responsible Scaling Policy, AI safety institutes, the White House, and others
- Develop and maintain procedures for conducting evaluations, including designing test suites, coordinating red team exercises, and analyzing results
- Create and manage dashboards and reporting systems to track model performance, safety metrics, and evaluation outcomes across different AI systems and versions
- Lead cross-functional workshops to identify potential risks and edge cases for evaluation, ensuring thorough coverage of AI capabilities and limitations
- Coordinate with external partners and industry standards bodies to align our evaluation practices with emerging best practices in responsible AI development
- Provide detailed status reports, identifying technical risks, dependencies, and areas requiring additional support
- Facilitate communication and coordination between technical work streams and stakeholders
- Continuously identify opportunities for technical process improvements and implement changes as needed
- Stay up-to-date with the latest developments in AI safety, ML engineering, and related fields to ensure the program remains at the forefront of responsible AI development
You might be a good fit if you:
- Have several years of experience in technical program management, with a track record of successfully delivering complex technical programs, preferably in AI development, ML engineering, or related fields
- Have experience executing technical programs that require systems and engineering-level knowledge.
- Have exceptionally strong interpersonal and communication…
To Search, View & Apply for jobs on this site that accept applications from your location or country, tap here to make a Search: