Senior Platform Engineer, Voice AI
Listed on 2026-05-30
Software Development
Software Engineer
About The Role
Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications — serving speech-to-text and text-to-speech models with best-in-class latency and reliability.
We're looking for a Senior Platform Engineer to own the API and infrastructure layer for voice workloads. You'll build the real-time WebSocket and HTTP APIs that developers use to ship voice experiences, design autoscaling for latency-sensitive streaming workloads, and ensure our multi-provider voice platform is reliable enough for production voice agents handling millions of calls.
This is a foundational hire on a small, high-impact team. Voice APIs have fundamentally different infrastructure requirements than text-based inference — bidirectional audio streaming, stateful connections, tight latency SLOs, and complex multi-model routing. You'll define how developers interact with Together's voice platform as we grow from early customers to the default infrastructure for voice AI.
- Own the real-time API layer (WebSocket + HTTP streaming) that powers Together's voice platform.
- Design autoscaling and orchestration for voice workloads running on tens of thousands of GPUs.
- Build the developer experience — APIs, observability, and tooling — for a fast-growing product area.
- Work with production voice customers (contact centers, AI agents, communication platforms) to ship what they actually need.
- Join a small, early-stage team with outsized impact on a new product line.
- Build and harden real-time WebSocket and HTTP streaming APIs for STT and TTS, including connection lifecycle management, backpressure, error handling, and reconnection, at the reliability bar needed for production voice agents.
- Design and ship autoscaling for voice model endpoints that handles bursty, real-time traffic patterns — accounting for concurrent connection limits, streaming state, and hard latency ceilings.
- Implement voice-specific API features: word-level alignment, real-time speaker diarization, audio format flexibility (G.711 μ-law for telephony, PCM, WebRTC formats), pronunciation controls, and multi-context WebSocket support.
- Build voice-specific observability — latency breakdowns, audio quality signals, and dashboards that help both the team and customers debug issues.
- Own multi-model normalization across our model partners (Cartesia, Deepgram, Rime, and others), ensuring consistent API behavior regardless of the underlying provider.
- Collaborate with the ML engineering side of the team on the interface between the API layer and the model serving stack, ensuring latency and reliability requirements are met end-to-end.
- Contribute to developer experience: API design, documentation, integration cookbooks, a playground, and showcases of how best-in-class voice agents are built.
- Lay the groundwork for multiple new products down the line.
- 5+ years of experience building large-scale, real-time distributed systems and API services.
- Deep expertise in real-time streaming infrastructure: WebSocket server architecture, Server-Sent Events, bidirectional streaming, connection multiplexing, and stateful protocol design.
- Expert-level programming in TypeScript and Python; experience with Rust is a plus.
- Strong distributed systems fundamentals: load balancing, autoscaling, rate limiting, and traffic shaping for latency-sensitive workloads.
- Experience with Kubernetes — including custom autoscalers, resource management, and health checking for stateful services.
- Strong product sense — you care about API ergonomics and think about what developers building voice apps actually need.
- Comfort working on a small, early-stage team where you'll wear multiple hats and move fast.
- Experience with audio or media protocols (WebRTC, G.711, PCM encoding) is a strong plus.
- Familiarity with ML model serving infrastructure and how inference engines work is a plus — you'll interface with the serving layer regularly.
- Full-stack experience (React, Next.js) is a nice-to-have for contributing to developer-facing tooling.
- Bachelor's or Master's degree in Computer Science, Computer…