Audio AI Engineer Job, Overland Park area, Kansas, USA, IT/Tech

Audio AI Engineer role at Propio Language Services

Propio is on a mission to make communication accessible to everyone. As a leader in real-time interpretation and multilingual language services, we connect people with the information they need across language, culture, and modality. We’re committed to building AI-powered tools to enhance interpreter workflows, automate multilingual insights, and scale communication quality across industries.

Job Type: Full-time

We are hiring an Audio AI Engineer who will develop and optimize end-to-end systems that enable real-time, high-fidelity speech-to-speech interpretation. This role focuses on seamlessly connecting speech recognition, translation, and synthesis technologies to create natural, low-latency interpretation experiences.

Key Responsibilities

Design and optimize end-to-end Speech-to-Speech pipelines that integrate ASR, translation, and TTS with minimal latency
Build bidirectional interpretation systems that handle turn‑taking, speaker identification, and context preservation across language boundaries
Collaborate with the Audio/Speech Engineer to optimize latency, quality, and robustness of speech components in the full pipeline
Work with the Staff ML Engineer to design efficient inference architectures and deployment strategies for real-time streaming systems
Develop streaming ASR and TTS systems capable of handling continuous, overlapping speech in interpretation scenarios
Benchmark and optimize latency across all pipeline stages (speech capture, recognition, translation, synthesis)
Integrate speaker diarization, acoustic environment adaptation, and speech enhancement into interpretation workflows
Partner with linguists and product teams to validate interpretation quality and gather domain‑specific feedback

Required Qualifications

Bachelor’s or Master’s degree in Electrical Engineering, Computer Science, or a related field
3+ years of experience in speech processing, audio engineering, or conversational AI systems
Deep expertise in ASR, TTS, and streaming audio architectures
Proficiency in Python, ML frameworks, and experience with real‑time signal processing
Experience building low‑latency production systems and optimizing for inference performance
Strong understanding of interpretation workflows, multilingual challenges, and speech quality metrics

Preferred Qualifications

Experience building speech-to-text pipelines or hybrid ASR + LLM systems
Familiarity with real-time audio processing or latency-sensitive applications

Seniority Level

Mid‑Senior level

Employment type

Full-time

Job Function

Engineering and Information Technology

Industries

Translation and Localization
