
Head of Research

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: Vals AI
Full Time position
Listed on 2026-06-26
Job specializations:
  • Research/Development
    AI Evaluation, Research Scientist
Salary/Wage Range or Industry Benchmark: 150000 - 200000 USD Yearly
Job Description & How to Apply Below

Measuring intelligence is hard, and humans haven't been particularly good at it. The proxies we've used (IQ, standardized tests, credentials) have shaped how we develop intelligence and how we value it, often in ways we later regret. AI gives us a chance to do better. The field is young enough that the methodologies for measuring what these systems can actually do are still being written, and the answers we settle on will shape what gets built, what gets deployed, and which workflows get automated next.

Vals is building the measurement layer for the AI economy: the benchmarks, methodologies, and standards that determine which models ship and where they get trusted. We're hiring a Head of Research to lead it.

Responsibilities

Concretely, you'll:

  • Advance the science of evaluation. The methodologies the field uses today — judge models, human-in-the-loop, static benchmarks — were built for a previous generation of models and break down on long‑horizon, real‑world tasks. You'll develop the new paradigms.
  • Oversee Vals' broader research portfolio, setting direction across the projects already underway and the ones we haven't started yet.
  • Publish work that moves the field forward. We want Vals' research to be cited, not just shipped.
  • Recruit and grow a research team alongside the founders.
  • Work directly with our enterprise customers and lab partners on the evaluation problems they actually have.
Requirements
  • A PhD in ML/NLP (in progress or completed), or equivalent industry research track record.
  • Deep familiarity with the LLM evaluation landscape: existing benchmarks, their failure modes, judge‑model approaches, human‑in‑the‑loop methodologies.
  • A bias toward research that affects what people actually deploy, rather than benchmarks that are easy to game.
  • Strong written and verbal communication. You'll publish, present, and talk to customers and labs.
  • Ability to work in‑person, in San Francisco.
Nice to Haves
  • A widely‑cited benchmark or eval framework you've built or co‑built.
  • Prior experience at a frontier lab (Anthropic, OpenAI, Google DeepMind, Meta FAIR) or a research‑led startup.
  • Domain depth in one or more of our verticals (legal, finance, insurance, healthcare).
  • Experience leading or mentoring other researchers.
  • A public research presence: papers, blog posts, talks, or open‑source contributions people in the field recognize.
What We Offer
  • Highly competitive salary and equity. Excellence is well rewarded.
  • Relocation and transportation support.
  • Health/dental insurance coverage.
  • Lunch and dinner provided, free snacks/coffee/drinks.
  • 401(k) plan.
  • Unlimited PTO.