Engineering Manager, AI Observability
Listed on 2026-06-02
Software Development
AI Engineer (Applied/Software), Machine Learning/ ML Engineer
The AI Observability team at Netflix makes AI, ML, and agentic systems transparent, reliable, and production‑ready. The team builds end‑to‑end observability for ML and GenAI workloads, capturing model inputs, features, predictions, outcomes, and behavior across online and batch systems.
Responsibilities
- Partner with ML researchers, engineers, and platform teams to embed “observability‑by‑default” into new AI services, ensuring telemetry, monitoring, and evaluation are built into systems from day one.
- Lead the end‑to‑end observability strategy for AI workloads, including LLMs, generative AI systems, and classical ML models; drive build‑vs‑buy decisions and scale solutions across model training, online inference, and agent orchestration.
- Drive the evolution of LLM evaluation frameworks, covering prompt instrumentation, response quality measurement, grounding correctness, hallucination rates, and human/LLM‑as‑a‑judge scoring.
- Define and execute a platform roadmap focused on incremental delivery, with clear success metrics, migration goals, and strong adoption across teams.
- Communicate progress to stakeholders, customers, and senior leadership.
- Hire, grow, and mentor a high‑performing engineering team while fostering an inclusive and collaborative culture.
Qualifications
- 10+ years of software engineering experience and 3+ years of management experience.
- Experience leading teams responsible for building high‑traffic distributed systems and ML infrastructure.
- Deep familiarity with AI and ML operations, including model evaluation, drift detection, and continuous monitoring at scale.
- Experience with AI observability and monitoring tools (such as Arize AI, Fiddler AI, Weights & Biases, Vertex AI Model Monitoring, and SageMaker Model Monitor).
- Exposure to LLM or generative AI systems, including prompt/result logging, evaluation metrics, LLM‑as‑a‑judge frameworks, and human‑in‑the‑loop review.
- Strong technical acumen and ability to act as a credible technical advisor, set and enforce a high‑quality bar for code and system design, and mentor the team.
- Strong communication and collaboration skills, and the ability to build strong relationships with internal customers and external partners.
- A demonstrated ability to develop, drive, and execute a technical vision and roadmap.
- Experience managing a hybrid team with partners and team members distributed across US geographies and time zones.
Generally, our compensation structure consists solely of an annual salary; we do not have bonuses. The range for this role is $ – $.
Benefits
We provide comprehensive benefits including Health Plans, Mental Health support, a 401(k) Retirement Plan with employer match, Stock Option Program, Disability Programs, Health Savings and Flexible Spending Accounts, Family‑forming benefits, and Life and Serious Injury Benefits. We also offer paid leave of absence programs. Full‑time hourly employees accrue 35 days of paid time off annually, to be used for vacation, holidays, and sick time.
Full‑time salaried employees are immediately entitled to flexible time off.
We are an equal‑opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams. We approach diversity and inclusion seriously and thoughtfully. We do not discriminate on the basis of race, religion, color, ancestry, national origin, caste, sex, sexual orientation, gender, gender identity or expression, age, disability, medical condition, pregnancy, genetic makeup, marital status, or military service.