Senior Data Scientist - Audio and Speech (Multimodal AI)

Job in Sharon, Norfolk County, Massachusetts, 02067, USA
Listing for: Elbit Systems
Full Time position
Listed on 2026-06-03
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning/ ML Engineer
Position: Senior Data Scientist - Audio and Speech (Multimodal AI)

Are you ready to push the boundaries of Audio Intelligence?

We're looking for a Senior Data Scientist with deep expertise in Audio AI, Speech Processing, and Generative Modeling to design and develop advanced on‑prem multimodal systems capable of understanding, generating, and analyzing complex audio streams in noisy, real‑world environments.

You'll join a world‑class Defense Tech AI team building speech‑driven solutions that enable intelligent communication, operational insight, and next‑generation human‑machine interaction.

What You'll Do

Fine‑tune and evaluate Speech‑to‑Text (STT) models optimized for noisy, low‑latency, and mission‑critical environments

Develop speaker identification and diarization, along with sentiment and emotion analysis, to detect tone, stress levels, and affective patterns

Design and optimize multimodal pipelines combining audio, text, and visual inputs for enhanced semantic understanding and cross‑modal reasoning

Contribute to Generative AI innovations: noise reduction, voice conversion, speech enhancement, and conversation insights

Collaborate closely with ML engineers and research peers to deploy, scale, and optimize Audio AI models on‑prem and edge hardware

Work with domain experts to adapt models for real‑time speech understanding, decision support, and behavioral insights

Your Expertise

Solid background in Machine Learning, Deep Learning, and Audio Signal Processing

5+ years of hands‑on experience developing and deploying speech or audio‑based AI models

3+ years focused on STT / ASR, TTS, speaker recognition, or sentiment analysis

Deep familiarity with architectures such as Conformer, Whisper, RNN‑Transducer, FastSpeech / Tacotron, speaker embedding networks, and self‑supervised speech representations

Experience handling noisy, real‑time audio, latency optimization, and edge‑device constraints

Understanding of semantic embeddings, multimodal search, and RAG architectures

Strong data‑driven mindset and ability to conduct research on novel Audio AI approaches

Comfortable working with Agile workflows, MLOps, and DevOps principles

Publication record, Kaggle or other challenge participation, or equivalent (an advantage)

Why Join Us

Work with leading researchers and engineers on next‑generation Speech and Audio Intelligence

Make a direct impact on speech understanding, generation, and sentiment analytics in real‑world applications

Collaborate on cutting‑edge multimodal AI systems integrating vision, audio, and language

Be part of a forward‑thinking team that values creativity, research excellence, and continuous learning

Shape the future of Audio and Speech AI - from concept to deployment

Only suitable applications will be considered

#Netanya
Position Requirements
10+ Years work experience