Voice Recognition Engineer – Browser- Speech Interfaces
Job in Warner Robins, Houston County, Georgia, 31088, USA
Listed on 2026-06-02
Listing for: New York Technology Partners
Full Time position
Job specializations:
- IT/Tech: Technical Writer, Technical Support, AI Engineer
Job Description & How to Apply Below
Position Type: Contract
Location: Remote
Key Responsibilities:
- Develop and optimize voice recognition functionality across Chrome, Edge, Safari, Firefox, and Brave.
- Ensure consistent performance, compatibility, and user experience across desktop, laptop, mobile, and tablet environments.
- Customize and extend the Web Speech API and integrate third‑party speech frameworks, including (but not limited to):
  - Eleven Labs (Scribe)
  - Deepgram
  - OpenAI Whisper API
  - Amazon Transcribe / Polly
- Optimize recognition speed, accuracy, and robustness, especially in noisy or low‑bandwidth environments.
- Conduct benchmarking and tuning for real‑world usage scenarios across diverse accents, languages, and acoustic conditions.
- Collaborate with product and design teams to build intuitive, inclusive voice interactions.
- Support configurable speech duration thresholds and accessibility best practices for users with varying abilities.
- Partner with technical leads and product managers to align voice capabilities with product roadmap.
- Support client‑facing pilots, demos, and proof‑of‑concept initiatives.
- API Tailor: Deep familiarity with Web Speech API and at least one major commercial speech‑to‑text platform.
- Accuracy‑Focused: Passionate about refining speech models for real‑world reliability, speed, and multilingual performance.
- Collaborative Partner: Communicates effectively with cross‑functional teams (engineering, product, UX).
- Innovative Builder: Enjoys prototyping, problem‑solving, and elevating voice interaction beyond basic transcription.
- Must have hands‑on experience with the Web Speech API and at least one other commercial speech framework.
- Implement custom logic for error handling, timeout management, speech completion detection, and multilingual support.
- 3+ years of experience in speech recognition, voice UI, or audio processing.
- Demonstrated work with Web Speech API and at least one of the following: Eleven Labs, AssemblyAI, Deepgram, OpenAI Whisper, Google Cloud STT, Azure Speech, or Amazon Transcribe.
- Understanding of latency, privacy, and security considerations in client‑side voice processing.
- Experience with WebRTC, the MediaRecorder API, or AudioContext.
- Background in natural language understanding (NLU) or voice assistant development.
- Contributions to open‑source speech or accessibility projects.
Mid‑Senior level
Employment Type: Contract
Job Function: Information Technology and Engineering
Industries: Research Services