AI Researcher Job, San Francisco area, California, USA, Software Development

Engineering at Ivo

Engineers at Ivo are inventors. Ivo was first‑to‑market with:

An AI agent that lives in MS Word and edits the document for you [2023]
Ditching imprecise embeddings models in favor of agentic RAG [2023]
Large‑scale LLM‑based legal fact extraction [2024]
A legal assistant that can search large contract databases without sacrificing accuracy [2024]
Clustering legal documents descended from the same family [2025]
Automatic deviation analysis to locate buried risk in huge contract databases [2025]
Merging contracts with their amendments to produce a time‑series of “composite contracts” (a customer actually cried when we showed her this) [2025]

The Role:

Why, What, and Who

AI researchers are the engine of innovation. You'll push the state of the art in applying LLMs, deep learning, and advanced AI techniques to the unique and high‑stakes challenges of the legal domain, where accuracy, explainability, and robustness aren't nice‑to‑haves; they're the product. Your research will translate directly into features that redefine how legal professionals work.

This isn't an academic role. It's a chance to see your research fundamentally change an industry.

What you’ll do

You’ll own a research roadmap end‑to‑end: identifying the right problems, designing experiments, prototyping, and shipping the winners into production alongside the engineering team.

Advance the core AI platform by designing and implementing novel approaches to the problems at the heart of Ivo’s product: reasoning over long‑context legal corpora, contract comparison and redlining, information extraction, and automated drafting and editing.
Make our models trustworthy by conducting research on procedural hallucination detection and resolution, calibration, and explainability. In a domain where a single fabricated citation can sink a deal, the bar for groundedness is uncompromising — your job is to keep raising it.
Push frontier techniques into production by exploring and applying advanced fine‑tuning, PEFT, and distillation techniques to make our models faster, cheaper, and more accurate on legal‑specific tasks. Evaluate emerging work in agentic systems, long‑context modeling, and reasoning, and figure out which ideas actually move the needle for our customers.
Build the evaluation infrastructure by designing and maintaining datasets, benchmarks, and evals for training and measuring model performance on complex legal text. Define the metrics that matter, and hold the team to them.
Ship prototypes to production by partnering closely with Engineering and Product, writing internal reports that influence the technical direction of the platform, and presenting findings to both technical and non‑technical audiences across the company.

Who you’ll be

Required
A Ph.D. in Computer Science, Engineering, Mathematics, Physics, or a related quantitative field — or equivalent industry research experience with a comparable track record.
Evidence of exceptional ability — a paper, shipped system, open‑source contribution, competition result, or hard problem you cracked that puts you meaningfully ahead of your peers.
Deep, hands‑on experience in deep learning research and development, particularly with LLMs. Strong working knowledge of modern frameworks (PyTorch, JAX, or TensorFlow) and the surrounding open‑source ecosystem.
Expertise in at least one of the following: agentic systems, reasoning, parameter‑efficient fine‑tuning (PEFT) methods, quantization, inference optimization (e.g., speculative decoding), hallucination mitigation, novel architectures in deep learning, or robust evaluation methodology for LLMs.
Excellent communication skills, with the ability to articulate complex research findings clearly to both technical and non‑technical audiences.
A bias toward action: you ship rather than perfect, you measure rather than guess, and you’d rather have a working prototype today than a polished plan next week.
Nice to Have
Publications at top venues — e.g., NeurIPS, AAAI, ICML, ICLR — or peer‑reviewed journals in adjacent quantitative fields (e.g., Journal of Computational Physics, SIAM journals, Journal of the ACM, Nature, Science, PNAS) — or equivalent strong…