Engineering Manager - Client Delivery & Observability
Location: Los Gatos, Santa Clara County, California, 95033, USA
Listed on: 2026-05-20
Listing for: Netflix, Inc.
Position: Full Time
Job specializations: IT/Tech; Systems Engineer, Cloud Computing: Infrastructure & Operations
# The Opportunity:
The Client Delivery & Observability (CDO) team is a newly formed group that owns two connected domains at the heart of Netflix's client engineering. The first is release delivery: the automation platform that safely ships client application updates to Netflix's more than 300 million members across TV, TV Connected, and mobile devices. The second is real-time observability: the streaming analytics and canary analysis infrastructure that detects regressions before they reach members.
Together, these systems ensure that every client release, server canary, and A/B test is both safely delivered and continuously monitored. The observability stack includes a multi-region streaming analytics platform built on Mantis, Kafka, and Druid that processes the full firehose of Netflix client telemetry. The team operates at the intersection of large-scale infrastructure engineering, release automation, and applied statistical methods.
The observability system has caught regressions in production within seconds. On the delivery side, the release automation platform orchestrates deployments across all Netflix client surfaces, using A/B-test-based canaries to safely roll out new builds to millions of devices. The team continues to extend these capabilities to support Netflix's expanding live, ads, and gaming businesses.
We are looking for an experienced Engineering Manager to lead this team through its next phase. You will lead two squads, each with an established tech lead, and your challenge will be to build a unified team identity, set direction across both domains, and rationalize a combined roadmap. The work ahead is compelling: scaling the observability platform to support Netflix's expanding live and ads businesses, advancing the canary analysis toolkit, and evolving the release automation platform for new client architectures.
The team partners deeply with experimentation, streaming infrastructure, client platform, developer productivity, and device reliability groups, making cross-functional leadership a core part of the role.
# Links to Some of Our Work:
* Druid Caching (2026):
Stop Answering the Same Question Twice: Interval-Aware Caching for Druid at Netflix Scale
* Sequential A/B Testing (2024):
Sequential A/B Testing Keeps the World Streaming Netflix
* Client App Deployment (2022):
Modernizing the Netflix TV UI Deployment Process
* Safe Client Updates (2021):
Safe Updates of Client Applications at Netflix
# Responsibilities:
* Team Leadership & Talent Development:
You will lead a newly formed team spanning two squads with distinct technical domains, each with its own tech lead. You will hire, coach, and grow a high-performing team of engineers, developing your tech leads into effective technical leaders while fostering the professional growth of the broader team through regular, constructive, and empathetic feedback. You will build a unified team identity across the two squads and ensure that quieter voices have space alongside louder ones. You're not afraid of hard conversations.
* Vision & Execution:
Develop and execute a unified roadmap that balances two domains: release delivery infrastructure and real-time observability systems. You will rationalize inherited priorities across both squads, coach your team on the design and scaling of robust and reliable systems, and ensure that the team's work compounds rather than fragments.
* Cross-Functional Partnership:
Build strong relationships across a wide stakeholder surface: client platform teams, streaming infrastructure, experimentation, developer productivity, and device reliability. You will collate and translate complex partner requirements into actionable engineering goals. You'll act as a primary evangelist, extracting context from stakeholders to fill gaps while promoting the team's current and future capabilities.
* Operational Excellence:
Own the reliability and operational health of the team's production systems. You will drive incident response practices, ensure sustainable on-call rotations, and balance investment in reliability and technical debt against feature delivery. You will establish and track service-level objectives that keep the team accountable to its consumers.
# Skills & Characteristics:
* Cross-Domain Leadership:
You are comfortable leading across distinct technical domains without being the deepest expert in either. You trust your tech leads for domain depth and focus your energy on setting direction, removing obstacles, and connecting the dots between teams.
* Context, Not Control:
You're allergic to micromanagement. You prefer to give context, delegate, and watch amazing people…