Forward Deployed Research Scientist Job: San Francisco area, California, USA (IT/Tech)

Shape the Future of AI

At Labelbox, we're building the critical infrastructure that powers breakthrough AI models at leading research labs and enterprises. Since 2018, we've been pioneering data-centric approaches that are fundamental to AI development, and our work becomes even more essential as AI capabilities expand exponentially.

About Labelbox

We're the only company offering three integrated solutions for frontier AI development:

Enterprise Platform & Tools:
Advanced annotation tools, workflow automation, and quality control systems that enable teams to produce high-quality training data at scale

Frontier Data Labeling Service:
Specialized data labeling through Alignerr, leveraging subject matter experts for next-generation AI models

Expert Marketplace:
Connecting AI teams with highly skilled annotators and domain experts for flexible scaling

Why Join Us

High-Impact Environment:
We operate like an early-stage startup, focusing on impact over process. You'll take on expanded responsibilities quickly, with career growth directly tied to your contributions.

Technical Excellence:
Work at the cutting edge of AI development, collaborating with industry leaders and shaping the future of artificial intelligence.

Innovation at Speed:
We celebrate those who take ownership, move fast, and deliver impact. Our environment rewards high agency and rapid execution.

Continuous Growth:
Every role requires continuous learning and evolution. You'll be surrounded by curious minds solving complex problems at the frontier of AI.

Clear Ownership:
You'll know exactly what you're responsible for and have the autonomy to execute. We empower people to drive results through clear ownership and metrics.

Role Overview

Alignerr is Labelbox's human data organization - we produce the training data that frontier AI labs use to build their most capable models. Our Forward Deployed Research Team sits at the intersection of research science and client delivery, embedding research capability directly into the engagements that drive our business.

This is not a traditional research scientist role. You will not spend months pursuing a single research question. You will work on multiple client engagements simultaneously, operating on timescales of days to weeks. You will sit in scoping meetings with research teams at major AI labs, reason scientifically about data strategy in real time, fine-tune open-weight models to validate our data methodology, and collaborate with our Applied Research team to turn client-grounded findings into published work.

The pace is fast, the problems are applied, and the feedback loops are short.

We are looking for someone who finds that energizing, not compromising.

Your Impact

Engage directly with frontier lab research teams. You will be in the room during client scoping meetings - not as support staff, but as a technical peer. You'll engage on methodology, challenge assumptions about data requirements, and shape project specifications based on a scientific understanding of how data composition affects model outcomes.

Develop deep scientific understanding of client engagements. For each project, you will build a working model of the client's architecture, training methodology, and target capabilities. You'll use this understanding to reason about why a particular data strategy will or won't work, identify risks early, and iterate with empirical grounding - not intuition.

Run ablation studies and fine-tune open-weight models. You will fine-tune models on client data (and proxy data) to empirically measure the impact of our data on model performance. This is how we validate that what we deliver actually improves our customers' models - and how we catch problems before the client does.

Consult on workflow and quality systems. You will partner with our Human Data Operations team to review annotation schemas, task designs, and quality rubrics before projects go into execution. Your job is to ensure the spec is technically sound - that the data we produce will actually serve the client's training objectives.

Collaborate with Applied Research on publications and benchmarks. Our Applied Research team owns the long-horizon research agenda. Your role is to feed them signal from the field - generalizable findings, reusable methodologies, empirical results - and help drive joint projects to completion. You will contribute to benchmarks, white papers, and conference submissions that establish Labelbox's research credibility.

What You Bring

Required
- MS or PhD in Machine Learning, NLP, Computer Science, or a related quantitative field.
- Hands-on experience fine-tuning large language models (open-weight models such as Llama, Mistral, Qwen, or similar).
- Strong understanding of LLM training pipelines - pretraining, supervised fine-tuning, RLHF/DPO, and how data quality and composition affect each stage.
- Experience designing and executing experiments with rigor - hypothesis formation, controlled comparisons, statistical analysis of results.
- Ability to operate …