Voice AI Engineer - Applied AI Job: Sunnyvale area, California, USA, IT/Tech

Position: Staff Voice AI Engineer - Applied AI
About the Role:

Applied AI at Uber builds intelligent systems that power next-generation product experiences for riders, drivers, merchants, and couriers. As a Staff Voice AI Engineer, you will lead the design and deployment of large-scale, real-time Voice AI systems that enable natural, reliable, and intelligent voice interactions across Uber's ecosystem.

You will operate as a full-stack technical leader across speech modeling, LLM-powered conversational intelligence, and low-latency backend infrastructure - owning Voice AI systems end-to-end, from model development and evaluation to highly available, distributed production services. This includes advancing capabilities in automatic speech recognition (ASR), text-to-speech (TTS), spoken language understanding, and LLM-driven dialogue systems.

You will partner closely with product, design, and infrastructure teams to translate customer pain points into seamless voice-first experiences - setting the foundation for how Voice AI is built, deployed, and operated across Uber's global platform.

What You Will Do:

* Design and build end-to-end Voice AI solutions, from understanding customer pain points and defining product requirements to deploying LLM-powered, real-time voice interfaces in production.

* Benchmark and evaluate voice AI systems, including speech recognition, speech synthesis, and spoken language understanding, by designing evaluations, analyzing results, and identifying systematic weaknesses.

* Improve voice model performance through system prompt tuning, fine-tuning voice- and speech-specific models, and optimizing architectures for low-latency, real-time voice interactions.

* Analyze voice request logs, prompt traces, and audio inputs to diagnose failure modes and improve transcription accuracy, conversational quality, and overall user experience.

* Build and maintain internal tools and platforms to automate Voice AI workflows, such as large-scale transcription pipelines, real-time audio processing services, and evaluation harnesses for voice quality.

* Own Voice AI systems in production end-to-end, including rollout strategies, monitoring, alerting, quality regression detection, and on-call readiness.

* Collaborate closely with product, design, and research teams to translate user needs into Voice AI capabilities with measurable business and customer impact.

Basic Qualifications:

* 10+ years of experience in software engineering, data science, or machine learning, including a track record of shipping production AI systems.

* Deep understanding of large language models, including fine-tuning, prompt engineering, embeddings, and retrieval-augmented generation (RAG).

* Strong backend and distributed systems expertise, with experience designing and operating highly available, scalable services in production.

* Deep experience with ML infrastructure, including model training pipelines, online serving systems, feature stores, experiment platforms, and evaluation frameworks.

* Hands-on experience with distributed data processing systems (e.g., Spark, Flink, Ray) and workflow orchestration (e.g., Airflow or equivalent).

* Ability to analyze data, run experiments, and derive insights for model and product improvement.

* Excellent communication and collaboration skills across technical and non-technical teams.

Preferred Qualifications:

* Experience building evaluation frameworks for Voice AI, including metrics and human/LLM-assisted evaluations for speech recognition accuracy, latency, robustness, and naturalness of synthesized speech.

* Demonstrated expertise in machine learning fundamentals applied to voice, including model evaluation, training, and fine-tuning of ASR, TTS, or speech-language models.

* Proven experience deploying Voice AI systems to production, with an emphasis on low-latency, high-reliability, real-time environments.

* Experience writing developer documentation, creating voice-specific SDKs, or enabling internal teams to build on shared Voice AI platforms.

* Hands-on work with large-scale audio datasets, including data curation, labeling strategies, and optimization of voice processing pipelines at scale.

For San Francisco, CA-based roles:
The base salary range for this role is USD $232,000 per year - USD $258,000 per year.

For Sunnyvale, CA-based roles:
The base salary range for this role is USD $232,000 per year - USD $258,000 per year.

For all US locations, you will be eligible to participate in Uber's bonus program, and may be offered an equity award and other types of compensation. All full-time employees are eligible to participate in a 401(k) plan. You will also be eligible for various benefits. More details can be found at the following link []().

Uber's mission is to reimagine the way the world moves for the better. Here, bold ideas create real-world impact, challenges drive growth, and speed fuels progress. What moves us, moves the world - let's move it forward, together.

Uber is proud to be an Equal Opportunity employer. All qualified applicants will receive…

