Member of Technical Staff - ML Research Engineer; Multi-Modal - Audio
Listed on 2026-01-26
IT/Tech
Systems Engineer, Data Engineer, AI Engineer
About Liquid AI
Spun out of MIT CSAIL, we build general-purpose AI systems that run efficiently across deployment targets, from data center accelerators to on-device hardware, ensuring low latency, minimal memory usage, privacy, and reliability. We partner with enterprises across consumer electronics, automotive, life sciences, and financial services. We are scaling rapidly and need exceptional people to help us get there.
The Opportunity
Our Audio team is building frontier speech-language models that handle STT, TTS, and speech-to-speech in a single architecture. This role sits at the center of applied audio model development, working directly with the technical lead to ship production systems that run on-device under real-time constraints. You will own critical work streams across data pipelines, evaluation systems, and customer deployments. If you want high ownership on rare technical problems in a small, elite team where your code ships, this is the role.
What We're Looking For
We need someone who:
Builds first, theorizes later: You ship working systems, not just notebooks. Production-grade code is your default, not a stretch goal.
Owns outcomes end-to-end: From data pipelines to customer deployments, you take responsibility for the full stack without waiting for someone else to handle the hard parts.
Thrives under constraints: On-device, low-latency, memory-limited systems excite you. You see constraints as design parameters, not blockers.
Ramps quickly on new territory: Gaps in specific subdomains are fine if you close them fast. You seek out feedback and stay focused on what moves the needle.
What You'll Do
Build and scale data pipelines for audio model training, including preprocessing, augmentation, and quality filtering at scale
Design, implement, and maintain evaluation systems that measure multimodal performance across internal and public benchmarks
Fine-tune and adapt audio models for customer-specific use cases, owning delivery from requirements through deployment
Contribute production code to the core audio repository, collaborating with infrastructure and research teams
Support experimentation under real hardware constraints, shifting between customer work and core development as priorities evolve
Must-have:
Strong programming fundamentals with demonstrated ability to write clean, maintainable, production-grade code
Experience building and shipping production ML systems beyond model training (data pipelines, evals, serving infrastructure)
Proficiency in PyTorch and familiarity with distributed training frameworks (DeepSpeed, FSDP, or similar)
Track record of collaborating effectively in shared codebases with high engineering standards
Nice-to-have:
Direct experience with audio/speech models (ASR, TTS, vocoders, diarization, or speech-to-speech systems)
Experience designing and running large-scale training experiments on distributed GPU clusters
Open-source contributions that demonstrate code quality and engineering judgment
What Success Looks Like
Within 6 months, you independently deliver production-ready data pipelines or evaluation systems and own at least one customer workstream end-to-end
Your PRs to the core audio repo are accepted without heavy rework, demonstrating strong judgment in system design
By year end, you operate as a second pillar to the technical lead, unblocking parallel work streams and raising overall team velocity
What We Offer
Rare technical problems: Work on audio-to-audio frontier systems with real ownership in a team small enough that your contributions ship directly to production.
Compensation: Competitive base salary with equity in a unicorn-stage company
Health: We pay 100% of medical, dental, and vision premiums for employees and dependents
Financial: 401(k) matching up to 4% of base pay
Time Off: Unlimited PTO plus company-wide Refill Days throughout the year