
AI Research Engineer - Speech

Job in Redmond, King County, Washington, 98052, USA
Listing for: Centific Global Solutions, Inc.
Full Time position
Listed on 2026-06-01
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning/ML Engineer
Salary/Wage Range or Industry Benchmark: 150,000 - 160,000 USD yearly
Job Description & How to Apply Below
Position: AI Research Engineer - Speech 1

About Centific

Centific is a frontier AI data foundry that curates diverse, high-quality data, using our purpose-built technology platforms to empower the Magnificent Seven and our enterprise clients with safe, scalable AI deployment. Our team includes more than 150 PhDs and data scientists, along with more than 4,000 AI practitioners and engineers. We harness the power of an integrated solution ecosystem, comprising industry-leading partnerships and 1.8 million vertical domain experts in more than 230 markets, to create contextual, multilingual, pre‑trained datasets; fine‑tuned, industry‑specific LLMs; and RAG pipelines supported by vector databases. Our zero‑distance innovation™ solutions for GenAI can reduce GenAI costs by up to 80% and bring solutions to market 50% faster. Our mission is to bridge the gap between AI creators and industry leaders by bringing best practices in GenAI to unicorn innovators and enterprise customers. We aim to help these organizations unlock significant business value by deploying GenAI at scale, helping to ensure they stay at the forefront of technological advancement and maintain a competitive edge in their respective markets.

Job Description

AI Engineer: Speech/Audio

Key Responsibilities
  • Design, develop, and deploy Large Audio Language Models (LALMs) capable of native audio understanding, reasoning, and generation.
  • Build Large Audio Reasoning Models that perform complex chain‑of‑thought reasoning over speech and audio inputs, including medical, technical, and conversational domains.
  • Contribute to Speech‑to‑Speech (S2S) system development, including speech understanding, dialogue management, and speech synthesis components.
  • Research and implement alignment mechanisms between speech encoders and LLM backbones using lightweight adapters, LoRA, and efficient fine‑tuning strategies.
  • Design efficient speech tokenization and temporal compression techniques suitable for long‑form audio reasoning and multi‑turn spoken dialogue.
  • Build comprehensive evaluation frameworks for audio reasoning capabilities, including benchmarks for speech QA, audio understanding, and reasoning accuracy.
  • Optimize inference pipelines for low‑latency, streaming applications in speech systems.
  • Collaborate with cross‑functional teams to transfer research innovations into production systems and customer‑facing applications.
  • Contribute to technical documentation, research write‑ups, and publications at top‑tier venues (NeurIPS, ICML, ACL, Interspeech).
Minimum Qualifications
  • Master's degree (required) or Ph.D. (preferred) in Computer Science, Electrical Engineering, or a related field with a focus on speech, audio ML, or multimodal learning.
  • 2+ years of industry or applied research experience in speech/audio AI, Large Language Models, or multimodal systems.
  • Demonstrated applied research contributions through publications, patents, or shipped products in speech/audio AI or LLMs.
  • Strong proficiency in Python and PyTorch, with hands‑on experience in GPU‑accelerated training for large‑scale models.
  • Solid understanding of speech and audio signal processing, acoustic modeling, and audio representations.
  • Working knowledge of modern LLM architectures (Transformers, SSMs) and training paradigms including instruction tuning and alignment methods.
  • Familiarity with modality alignment techniques: adapter‑based integration, cross‑modal attention, or audio‑text fusion methods.
  • Strong experimentation habits: clean code, systematic ablations, reproducibility, and clear technical communication.
Preferred Qualifications
  • Publication record at top‑tier venues (NeurIPS, ICML, ICLR, ACL, Interspeech, ICASSP) in audio language models, speech reasoning, or multimodal learning.
  • Hands‑on experience building or fine‑tuning Large Audio Language Models (e.g., Qwen‑Audio, SALMONN, LTU, Gemini Audio).
  • Experience with speech representation pretraining (HuBERT, wav2vec 2.0, Whisper, WavLM) and discrete speech tokenization.
  • Familiarity with Speech‑to‑Speech components: neural audio codecs (EnCodec, SoundStream), vocoders, or speech synthesis systems.
  • Experience with audio reasoning benchmarks (AIR‑Bench, MMAU, AudioBench) or building…