Lead AI Engineer
Listed on 2026-03-15
Software Development
AI Engineer, Machine Learning/ ML Engineer
Overview
Lead AI Engineer – Marketing & Service AI role based in Palo Alto. The position is part of a hub‑first model with regular in‑office collaboration expected.
About the Role
Klaviyo is investing in Marketing AI and Service AI to power AI‑native experiences across the company. As a Lead AI Engineer in the AI & Analytics organization, you will design and build scalable backend systems and user experiences that power AI products and agentic solutions. You will own complex services end‑to‑end, contribute to architecture for high‑impact AI features, and partner with product managers, ML engineers, and data scientists to turn AI ideas into reliable, scalable production capabilities.
This is a hands‑on, backend‑heavy role with opportunities to influence architecture, async processing pipelines, distributed systems, and more—without being the overall area tech lead.
You will be based in our Palo Alto office, collaborating with engineers in Boston and other locations, with a path to grow into broader technical leadership or future people leadership if desired.
What You’ll Do
- Design and build core AI services. Implement scalable, low‑latency backend systems and APIs powering Marketing and Service AI capabilities for 183K+ customers, handling billions of events and interactions.
- Scale AI data and inference pipelines. Develop robust data collection and processing pipelines so generative and agentic models have the features, context, and feedback needed for production.
- Build and harden AI serving systems. Host and orchestrate AI models (LLMs, tools, evaluators, retrieval systems) with strong contracts, logging, and observability.
- Evolve our agentic architecture. Improve how agents plan, call tools, and react to feedback to increase autonomy, reliability, and safety for customer‑facing and internal workflows.
- Apply and refine best practices for AI systems. Establish evaluation, safety/guardrails, prompt and model management, offline/online tests, and incident response.
- Collaborate across teams. Work with product, ML, data, and platform teams to clarify requirements and unblock dependencies across hubs and time zones.
- Mentor and uplevel teammates. Provide code reviews, share patterns, and help mid‑level and junior engineers grow in building distributed and AI‑powered systems.
- Help shape the Palo Alto hub. Participate in interviewing, onboarding, and local engineering rituals; contribute ideas to make Palo Alto a strong, collaborative hub.
- Measure what matters. Instrument services and use metrics (availability, latency, cost, agent success rates, eval scores, customer adoption) to guide decisions and influence roadmaps.
- You’ve already practiced agentic coding and are excited to explore new AI tools and workflows responsibly.
- Experienced backend engineer. 5–7+ years of software engineering with a focus on backend and distributed systems; led complex projects end‑to‑end and owned production services.
- Hands‑on with generative & agentic AI in production. Built and shipped generative or agentic AI applications (LLM flows, tool‑using agents, retrieval‑augmented systems); comfortable with prompt design, few‑shot approaches, fine‑tuning, and evaluation.
- Strong distributed systems and async background. Built reliable services, async processing pipelines, and distributed queues (e.g., Celery, Kafka, SQS, RabbitMQ, Redis).
- Fluent in Python and data tooling. Proficient in Python and backend frameworks (FastAPI, Django or similar); experience with Spark/Hadoop and ORM and migration tooling such as SQLAlchemy and Alembic.
- Cloud‑native experience. AWS and Kubernetes, CI/CD, observability, and best practices; understand impact of infrastructure on reliability, latency, and cost.
- Evaluation and quality‑minded. Experience building AI evals and quality instrumentation, and balancing latency, cost, and response quality.
- Collaborative and customer‑first. Enjoy working with PMs, engineers, and sometimes customers; use data and feedback to guide decisions.
- Growth‑oriented teammate. Clear communication, constructive feedback, and a culture of humility and ambition.
- AI experience and curiosity. Experience with AI in work or personal projects; eager to deepen agentic…