Founding Agent Harness Engineer
Listed on 2026-06-24
Software Development
AI Engineer (Applied/Software), AI Reliability/Performance Engineer, Backend Developer, AI QA/Validation Engineer
Founding Agent Systems Engineer
Location: Onsite in Palo Alto
Compensation: Competitive Salary + Equity
Model AI is building the infrastructure and application stack for the next generation of agentic AI systems.
We believe future AI applications will involve many agents working together: using tools, sharing context, evaluating their own behavior, modifying systems, and improving over time. On top of our Agent Cloud infrastructure, we are building an agentic system: an application layer that can continuously optimize models, codebases, workflows, and software systems.
In the near term, this means building agent systems that help codebases evolve with minimal human supervision: cleaner abstractions, stronger tests, lower technical debt, faster iteration, and better engineering velocity. Over time, we believe the same approach can extend to cloud optimization, database optimization, enterprise workflows, and more general applications.
About This Role
We are looking for a Founding Agent Systems Engineer to build the agent harnesses, evaluation systems, workflows, and feedback loops behind our product.
This role is ideal for someone who is strong in Python and systems engineering, understands LLM agents and evaluation, and enjoys turning research ideas into working products. You will work on the systems that allow agents to collaborate, inspect code, make changes, run tests, measure progress, and improve over time.
This is a deeply technical and product-oriented role at the intersection of LLM agents, evaluation, developer tools, workflow automation, and applied AI systems.
What You'll Do
- Build agent harnesses for multi-agent workflows and ralph loops.
- Design evaluation environments to measure agent performance, reliability, cost, latency, and quality.
- Build testing systems that connect agent behavior, code changes, and evaluation results.
- Create workflows where agents can inspect code, make changes, run tests, and determine whether performance improved.
- Develop shared-memory, coordination, and task-management systems for teams of agents.
- Build infrastructure for tracking task success, regressions, cost, latency, quality, and long‑term progress.
- Work on applied coding‑agent workflows, including refactoring, testing, debugging, code review, and technical‑debt reduction.
- Support customer demos and applied workflows that turn agent research into usable product experiences.
- Collaborate closely with ML systems engineers to make agent workloads run efficiently on Agent Cloud.
- Help turn research prototypes into reliable, measurable, production‑quality systems.
- Strong Python engineering skills.
- Experience building LLM agents, evaluation harnesses, developer tools, applied AI systems, or automation workflows.
- Strong systems instincts and the ability to build reliable, debuggable tooling.
- Experience with testing, benchmarking, CI, code analysis, or automated software workflows.
- Ability to reason about agent behavior, failure modes, evaluation quality, and product usefulness.
- Comfort working across agent logic, backend systems, infrastructure, and user‑facing product requirements.
- Strong product intuition and the ability to build demos that can evolve into real products.
- High ownership and the ability to operate effectively in an early‑stage startup environment.
- Coding agents, eval harnesses, SWE‑bench‑style environments, or tool‑use systems.
- Multi‑agent or long‑running agent workflows.
- LLM observability, tracing, evals, or RL workflows.
- CI/CD, automated testing, static analysis, or developer tooling.
- Hands‑on technical excellence and strong engineering judgment.
- End‑to‑end ownership, from design to implementation to production outcomes.
- Bias for action: ship quickly, learn from failures, and iterate.
- High intensity during critical milestones, with a focus on real customer impact.
- Ability to do deep, focused work and sustain execution.
- Clear communication with teammates, customers, and stakeholders.
- Comfort with ambiguity, rapid change, and wearing multiple hats.
- Low ego, high integrity, high accountability, and strong collaboration.
- Continuous learning and a belief that judgment, intelligence, and capability compound over time.