Principal Kubernetes DevOps Engineer
Job in
San Francisco, San Francisco County, California, 94199, USA
Listed on 2026-05-29
Listing for:
Zoom
Per diem position
Job specializations:
- IT/Tech
- Systems Engineer, Cloud Computing, SRE/Site Reliability, Network Engineer
Job Description & How to Apply Below
Requirements
- We are seeking a Principal Kubernetes DevOps Engineer who combines deep technical expertise with broad system understanding
- This engineer should be capable of diving into a wide range of services and identifying systemic issues across architecture, CI/CD flow, and containerization environments
- This role requires technical leadership, analytical skill, and cross-team collaboration to drive reliability, scalability, and modernization
- 15+ years in DevOps/SRE for large-scale production systems. Strong hands-on background in Linux systems, networking, and distributed systems
- Experience operating and designing low-latency, high-throughput backend services at global scale. Knowledge of media or real-time communication systems (e.g., MMR, WebRTC)
- Knowledge of TCP/IP, routing, DNS, load balancing, and packet capture tools. Familiarity with colocation data center operations, including hardware provisioning and troubleshooting
- Demonstrated experience with Terraform, Ansible, Kubernetes, Docker, and modern CI/CD pipelines. Strong problem-solving, debugging, and systems-level design skills
- Occasional weekend work may be required
- Ability to work across the globe or multiple time zones
- At Zoom, we’re building the next generation of Cloud and Colocation (Colo) infrastructure that powers seamless communication and collaboration for millions of users worldwide
- Leading deep-dive investigations across diverse services and environments, from real-time media systems to web, team chat, and AI, to uncover architectural or operational bottlenecks
- Designing and implementing improvements in deployment pipelines, orchestration frameworks, and CI/CD automation to increase reliability and release velocity
- Working closely with product and service owners to enhance containerization strategy, improve resource efficiency, and reduce operational friction
- Partnering with the Meeting DevOps and Cloud Infra teams to modernize hybrid infrastructure spanning colocation data centers, AWS, OCI, and other cloud providers
- Driving system observability, fault isolation, and resilience engineering, ensuring services meet strict availability and latency SLAs
- Providing technical mentorship to DevOps engineers and influencing best practices in automation, monitoring, and release engineering. Championing a culture of data-driven reliability through postmortems, SLIs/SLOs, and continuous performance optimization