
AI Frontier Evaluation Engineer — End-to-End Systems

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: Emeraldadvantageconcepts
Full Time position
Listed on 2026-06-25
Job specializations:
  • Software Development
    AI Engineer (Applied/Software), Backend Developer, AWS

About the Team

We build the data, evaluation, and experimentation infrastructure powering next‑generation agentic AI systems. Our work directly supports all five leading AI labs and focuses on the hardest problems in LLM reasoning, RL environments, and human‑in‑the‑loop workflows.

We're a fast‑moving, talent‑dense team with backgrounds in quant finance, top‑tier startups, and elite engineering orgs. Revenue is already in the 8‑figure range with a steep growth curve and a major Series A on the way.

The Role

This is a broad, high‑ownership engineering role — not a narrow feature lane.

You'll work across research, infra, product, and data, owning systems end‑to‑end. Expect to touch everything from RL environments to distributed infra to full‑stack dashboards.

A typical month might include:

  • Prototyping a new RL environment from a research paper
  • Deploying distributed experiments on Kubernetes
  • Improving reliability of Next.js dashboards
  • Building a Kafka pipeline for annotator analytics

You'll shape core systems used by frontier AI labs from day one.

What You'll Do
  • Build scalable systems: RL environments, APIs, human‑in‑the‑loop platforms
  • Collaborate with research, product, and design to ship quickly
  • Write clean, maintainable code with strong documentation
  • Participate in architecture discussions and code reviews
  • Solve real‑world scalability and reliability challenges
  • Contribute to the infrastructure powering frontier AI evaluation

Who We're Looking For

We're looking for early‑career engineers who have already shown they can thrive in fast‑moving, high‑ownership environments and want to work on some of the most challenging problems in AI.

Experience
  • 1–3 years as a full‑stack software engineer
  • Background at a high‑growth startup, top quantitative trading firm, or experience as a founding engineer at a company with meaningful early traction
  • If your experience is primarily in big tech, we look for a strong CS foundation (e.g., a top‑tier CS program such as Berkeley, CMU, MIT, or Stanford)
Bonus Experience
  • Time spent at companies focused on human‑in‑the‑loop AI, data labeling, or AI evaluation (e.g., Surge AI, Snorkel, Scale, Labelbox, Micro1, Mercor)
  • Exposure to fast‑paced environments where you shipped features end‑to‑end and owned outcomes
What Matters Most
  • You've built real systems — not just maintained them
  • You take ownership, move quickly, and enjoy solving hard technical problems
  • You're comfortable working directly with researchers, product teams, and customers
  • You thrive in environments where the roadmap changes based on what you learn
Technical Skills
  • Full‑stack: Next.js / React, Node.js / Python
  • Infra: Kubernetes, Kafka, Redis, Elasticsearch
  • Ability to build end‑to‑end systems with high ownership
Soft Skills
  • Strong ownership and bias toward shipping
  • Comfortable being client‑facing with AI lab researchers
  • Thrives in fast‑paced, high‑iteration environments
Work Environment

  • 5 days/week onsite in the Financial District
  • Flexible hours
  • Optional half‑day or remote on Sundays
  • Tight‑knit, high‑trust, high‑velocity team

Why Join
  • Work directly with frontier AI labs
  • Solve the hardest problems in AI evaluation
  • Massive ownership and impact from day one
  • Build at a scale most AI startups never reach
  • Join a team of elite engineers and operators