Senior Machine Learning Engineer

Job in Chicago, Cook County, Illinois, 60290, USA
Listing for: XSELL Technologies
Part Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning/ ML Engineer
Salary/Wage Range or Industry Benchmark: 125,000 - 150,000 USD yearly
Job Description & How to Apply Below

Overview

Sr. Machine Learning Engineer - Agentic Voice (Healthcare)

Location:

West Loop, Chicago, IL (Hybrid — 3 days/week in office)

Job Summary

We’re looking for a hands-on, entrepreneurial Senior Machine Learning Engineer who has already taken voice-centric AI systems (TTS, STT, LLM-driven dialog) from prototype to planet-scale production. You will own the full lifecycle of our ML stack (research, data pipelines, training, evaluation, deployment, and relentless optimisation) so that millions of patients can have natural, sub-second conversations with our Agentic Voice platform. You’ll collaborate tightly with product, infra, and compliance teams and set a high technical bar for ML excellence.

What sets this role apart: You'll specialize in creating highly optimized, domain-specific conversational AI models by fine-tuning and compressing existing LLMs and specialized conversational architectures for specific use cases. We need someone who can rapidly research, prototype, and deploy smaller, faster, cheaper models that outperform general-purpose solutions in conversational settings, achieving 10x speed improvements and 90% cost reductions. You'll also build efficient pipelines for intent classification, dialogue management, and text-based optimization that improve the conversational quality of our dialogue systems.

Responsibilities
  • Advanced Model Optimization & Fine-Tuning
    Apply LoRA, QLoRA, DPO, RLHF, and other parameter-efficient methods to create smaller, faster models optimized for conversational contexts; implement quantization, pruning, and knowledge distillation to reduce model size while preserving quality; work with modern conversational architectures (DeBERTa, SetFit, sentence transformers, lightweight decoder models) for domain-specific use cases; rapidly evaluate and adapt the latest research for conversational applications.
  • End-to-End ML Engineering
    Design, build, and maintain high-performing STT, TTS, and LLM pipelines that operate at under 800 ms end-to-end latency across thousands of concurrent calls; train and fine-tune smaller, task-specific LLMs optimized for real-time accuracy, latency, and cost.
  • Inference at Scale
    Optimize GPU- and CPU-based serving on EKS / Kubernetes using dynamic batching, quantisation, speculative decoding, and streaming gRPC / WebSockets; extend LangGraph / LangChain flows and Model Context Protocol (MCP) schemas to orchestrate complex multi-turn healthcare conversations safely and compliantly.
  • Data & Evaluation
    Build robust data pipelines (Kafka → Snowflake / S3) for conversation logs; design offline and online evaluation frameworks for ASR/WER, TTS MOS, and task-completion metrics.
  • Technical Leadership
    Establish ML best practices (versioning, monitoring, A/B gating, CI/CD for models) and mentor engineers on MLOps, audio processing, and prompt engineering.
  • Cross-Functional Collaboration
    Work daily with product managers, designers, compliance leads, and customer teams to translate business goals into scalable voice experiences; stay on the cutting edge of open-source speech and LLM research; run rapid POCs (e.g., Whisper-v3, Bark); explore efficient fine-tuning techniques (LoRA, DPO); continuously improve model performance in production environments.
  • Reliability & Compliance
    Ensure HIPAA-grade security, auditable PHI handling, guardrails, and fallback strategies to keep conversations safe and reliable 24 × 7.
Qualifications

Education

  • B.S. or M.S. in Computer Science, Machine Learning, or related field.

Experience

  • 7+ years building production ML systems, including 2+ years specifically in speech / conversational AI.
  • Proven track record of shipping voice AI or large-scale LLM products to tens of millions of users or thousands of concurrent sessions.

Technical Expertise

  • Advanced Fine-tuning & Model Compression: Proven experience with parameter-efficient fine-tuning techniques (LoRA, QLoRA, adapters) for conversational applications; knowledge of few-shot learning frameworks for conversational tasks with limited data; experience with model compression techniques (quantization with GPTQ/AWQ, pruning, knowledge distillation) for real-time inference.
  • Speech: Deep understanding of ASR (Whisper, NeMo, Kaldi) and TTS (Tacotron, Fast…
Position Requirements
10+ Years work experience