ML/LLM Operations Engineer
Listed on 2026-02-28
IT/Tech
AI Engineer, Data Analyst, Data Scientist
Your Future Evolves Here
Evolent partners with health plans and providers to achieve better outcomes for people with the most complex and costly health conditions. Working across specialties and primary care, we seek to connect the pieces of a fragmented health care system and ensure people get the same level of care and compassion we would want for our loved ones.
Evolent employees enjoy work/life balance, the flexibility to suit their work to their lives, and the autonomy they need to get things done. We believe that people do their best work when they're supported to live their best lives, and when they feel welcome to bring their whole selves to work. That's one reason why diversity and inclusion are core to our business.
Join Evolent for the mission. Stay for the culture.
What You’ll Be Doing:
We are seeking a skilled ML/LLM Operations Engineer to join our Data Science team at Evolent Health to ensure our AI systems deliver consistent, reliable, and compliant results in healthcare settings. This role is perfect for someone who thrives at the intersection of machine learning, operations, and healthcare compliance.
The role combines deep understanding of LLM behavior and evaluation with a meticulous approach to monitoring, quality assurance, and regulatory compliance in healthcare applications.
Collaboration Opportunities:
This position will play a critical role partnering with our Data Science and Engineering teams while also interacting with cross-functional organizations including DevOps, Compliance, Quality Assurance, Clinical Support, and Product Management to ensure our AI systems operate reliably and meet all healthcare industry requirements.
What You Will Be Doing:
- Develop and maintain standardized evaluation frameworks to consistently measure LLM performance across relevant healthcare metrics
- Build monitoring systems using Logfire to track AI model performance, detect drift, and alert the team to anomalies
- Create testing infrastructure for prompt versions, model selection, and quality assurance processes
- Design and implement audit sampling processes for continuous quality monitoring and clinical review workflows
- Oversee regulatory compliance processes, including documentation for bias assessments, model cards, and audit trails required in healthcare
- Optimize LLM operations through intelligent model selection, prompt engineering, and cost management strategies
- Support the transition from successful POCs to production-ready services with appropriate testing and validation
- Partner with DevOps on infrastructure requirements while focusing on AI-specific monitoring and optimization
- Create and maintain documentation, runbooks, and operational procedures for all deployed AI systems
- Collaborate with the Clinical Support Liaison to incorporate clinical feedback into system improvements
- Prepare regular reports on AI system quality, performance metrics, and compliance status
- Bachelor's or master's degree in computer science, data science, or a related field
- 2+ years of experience with Python development and at least one production LLM implementation
- Strong proficiency in SQL for complex log analysis and metrics generation
- Demonstrated experience with LLM APIs and frameworks (experience with Pydantic AI, LangChain, or similar)
- Experience with monitoring tools and practices for AI systems, including performance metrics, drift detection, and alerting
- Understanding of LLM behavior, prompt engineering, and common failure modes in production
- Experience building evaluation or testing frameworks for AI/ML systems
- Strong communication skills for cross-functional collaboration
- Experience with healthcare AI applications and compliance requirements is preferred
- Familiarity with multiple LLM providers (OpenAI, Anthropic, Google, Azure) is preferred
- Knowledge of the Pydantic ecosystem, including Pydantic AI and Logfire, is preferred
- Understanding of LLM evaluation metrics and methodologies is preferred
- Experience building tools for non-technical users is preferred
- Basic knowledge of containerization (Docker) for local testing and development is preferred
- Experience with cloud environments (AWS, Azure) as a user is preferred
- Understand…