Evals Lead, scale speech AI Job, Cardiff area, Wales UK, IT/Tech

Position: Evals Lead, Large-Scale Speech AI

Lead Evaluation Engineer — Speech & Multimodal Models

How do you measure whether an AI voice truly sounds real — and prove it with data?

You’ll join an AI team developing large-scale speech and multimodal systems for real-time interaction — models that generate, clone, and understand voice with natural expression and precision.

This is a founding evaluation role in a new, dedicated Evals team that defines how these models are measured, improved, and deployed safely. You’ll design objective and subjective evaluation pipelines, run large-scale human studies, and build automated systems that turn perception into measurable signal.

Your work will span every stage of model development — from research to production — collaborating with speech, audio, and ML teams to close the loop between modelling, feedback, and user experience.

What you’ll do

Build and scale evaluation pipelines for TTS, voice conversion, and ASR systems
Design human studies for subjective testing (e.g. MOS, ABX)
Define and implement objective metrics (WER, intelligibility, naturalness, prosody)
Automate evaluation dashboards and reporting systems
Train auxiliary models to capture new evaluation dimensions
Collaborate across data, model, and product teams to drive measurable improvement
Establish and scale the evaluation function as the team grows

You’ll bring

Strong experience building or running eval systems for speech or multimodal models
Familiarity with ASR, TTS, or voice cloning pipelines
Experience designing user studies or subjective model evaluation
Solid understanding of statistics and experimental design
Proficiency in Python and ML frameworks (PyTorch, Hugging Face, etc.)
Strong communication skills and cross-functional mindset

Why this role

This is a rare chance to build the evaluation foundation for models already deployed globally — shaping how next-generation speech systems are measured and improved. You’ll have the autonomy to define standards, lead future hires, and see your work directly impact millions of real-world interactions.

Fully remote (EU timezones preferred), global team. Competitive salary + meaningful stock options.

The company is well funded, with a nine-figure funding round, significant runway for growth, plenty of compute, and active hiring.

Apply today. Everyone will get a response.
