Senior Research Scientist - NLP
Listed on 2026-05-15
Software Development
Machine Learning/ ML Engineer, Data Scientist
About Yahoo Mail
Yahoo Mail is the ultimate consumer inbox with hundreds of millions of users. It's the best way to access your email and stay organized from a computer, phone, or tablet. With its beautiful design and lightning‑fast speed, Yahoo Mail makes reading, organizing, and sending emails easier than ever.
A Little About Us
The Mail Intelligence team is the brain behind the inbox. We are responsible for building the next generation of platforms and services that enable Yahoo to deliver deeply personalized, intelligent, and context‑aware experiences to hundreds of millions of users globally.
We process billions of messages and manage data on a petabyte scale. Using cutting‑edge algorithms, we extract knowledge and interconnect information from diverse sources to simplify our users' lives.
Building this knowledge presents many challenges in natural language processing, machine learning, and petabyte‑scale data processing. You will build tools and workflows that make it easier to manage and act on this vast information. You will apply your insights from the data to build innovative consumer applications for Yahoo.
Whether it's reinventing how people organize their day or building lightning‑fast, beautiful mobile experiences, our team is transforming the way the world connects.
A Lot About You
You are a seasoned Applied ML Researcher who thrives at the intersection of theoretical innovation and production‑grade execution. You don't just follow the latest LLM trends; you understand the mechanics of transformer architectures and how to optimize them for massive scale. You have expertise working across multiple ML and NLP spaces (including summarization, information extraction, classification, and ranking) at a very large scale.
You have hands‑on experience with knowledge distillation. You believe that a model is only as good as its evaluation framework and its ability to perform in a low‑latency production environment. You combine strong research fundamentals with pragmatic large‑scale production instincts and can independently drive ambiguous, high‑impact initiatives. You thrive in a research‑oriented environment and enjoy pushing the boundaries of innovation to deliver impactful solutions.
You excel at transforming theoretical research into practical applications, especially in Natural Language Processing (NLP). You are a hands‑on expert with experience training and evaluating large‑scale models, including cutting‑edge deep learning architectures. You are comfortable navigating ambiguity, driving high‑impact initiatives independently, and mentoring others.
- Lead R&D: Drive the research and development of deep learning models specifically tailored for large‑scale email and communication data.
- Model Optimization: Fine‑tune and adapt open‑source foundation models using parameter‑efficient techniques (LoRA, adapters) and quantization‑aware training.
- Efficiency at Scale: Design and implement knowledge distillation to transfer complex capabilities into smaller, high‑performance models.
- Modern Evaluation: Develop robust evaluation frameworks, including LLM‑as‑a‑judge methodologies and human‑in‑the‑loop validation.
- Product Integration: Build repeatable, scalable training workflows for high‑throughput production environments.
- Future‑Proofing: Explore agent‑based systems, tool‑use paradigms, and long‑term generative AI roadmaps.
- Mentorship: Raise the bar for technical excellence by guiding junior researchers and contributing to the broader team.
- PhD (preferred) or Master's degree in Computer Science, Machine Learning, NLP, or a related field.
- 5+ years of hands‑on experience in applied machine learning and deep learning, with significant hands‑on work in NLP and generative models at scale.
- Demonstrated experience fine‑tuning LLMs using LoRA or other parameter‑efficient methods.
- Experience with knowledge distillation, model compression, and/or training smaller models from larger teacher models.
- Deep understanding of transformer architectures, including encoder‑only, decoder‑only, and encoder‑decoder…