Software Engineer - Agent Quality
Listed on 2026-05-19
Location: New York
Staff Software Engineer - Agent Quality
At Databricks, we are obsessed with enabling data teams to solve the world’s toughest problems, from security threat detection to cancer drug development. We do this by building and running the world’s best data and AI platform so our customers can focus on the high-value challenges that are central to their own missions.
The Databricks AI Research organization is pushing the frontier of next-generation enterprise AI. We believe a company’s data is its greatest competitive advantage, and we’re building the models and agents that unlock it. Our work spans the full stack, from model training to advanced multi‑agent systems.
As a Staff Software Engineer – Agent Quality, you will be a founding member of a new team focused on evaluating and continuously improving Databricks’ AI Agents. You will design and scale the infrastructure, tooling, and developer workflows that let researchers and engineers evaluate agents rigorously — driving a flywheel where evaluation results feed directly back into agent improvement across the full lifecycle, from development and training to production.
The impact you will have
- Stand up the foundational evaluation infrastructure for Genie Agents, enabling rigorous benchmarking, regression detection, and quality measurement across research and product teams.
- Build the flywheel that connects evaluation results back into agent improvement — closing the loop between production signals, training, and iterative development.
- Shape the long‑term technical direction for agent quality infrastructure, with real influence over how Databricks measures and improves its first‑party agents and agent development platform as both expand.
What we look for
- 6+ years of industry experience building software systems
- Strong Python programming skills, with experience building production or research infrastructure
- Experience building or operating distributed systems, data pipelines, or large‑scale infrastructure with a focus on reliability, correctness, and operational maturity
- Ability to design pragmatic but rigorous systems that produce trustworthy, reproducible signals for complex applications
- Comfort working across ambiguous research and product boundaries, and partnering with both researchers and engineers to turn ideas into robust internal platforms
- A high bar for technical quality, strong ownership, and the ability to influence roadmap and execution across multiple teams
- Experience with devtools, CI/CD platforms, testing frameworks, observability tooling, or benchmarking infrastructure
- Familiarity with how LLM or agent quality is measured — whether through evals, experimentation platforms, or production monitoring
Local Pay Range: $190,000–$270,000 USD
Benefits
At Databricks, we strive to provide comprehensive benefits and perks that meet the needs of all of our employees.
Our Commitment to Diversity and Inclusion
At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.
Compliance
If access to export‑controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.