
DevOps Engineer

Job in San Jose, Santa Clara County, California, 95199, USA
Listing for: Zoom
Full Time position
Listed on 2026-04-23
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability, Cloud Computing: Infrastructure & Operations, Network Engineer
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below
Position: Staff DevOps Engineer

Immigration sponsorship is not available for this position

What you can expect

We are hiring a Staff DevOps/Site Reliability Engineer to ensure reliability, scalability, and operational excellence for our real-time communications platform. This platform supports audio/video conferencing, recording, and live streaming. The position requires expertise in infrastructure engineering, global team collaboration, and cross-functional partnerships.

About the Team

This team manages essential meeting service operations. They handle global, large-scale distributed systems and advance communication technology to connect individuals across physical distances.

Responsibilities

  • Ensuring reliability engineering and operations by owning the SLO/SLI framework for real-time services: defining, tracking, and improving latency, availability, jitter, and packet loss.
  • Leading incident response for critical outages across the real-time platform, coordinating across time zones and engineering disciplines.
  • Promoting a blameless post-mortem culture and ensuring action items lead to measurable reliability enhancements.
  • Implementing chaos engineering and game-day exercises to proactively identify failure modes before user impact occurs.
  • Building and evolving observability tools (dashboards, alerting systems, and distributed tracing) tailored for real-time media infrastructure challenges.
  • Serving as the architectural authority on deployment patterns, infrastructure design, and operational readiness for real-time services.
  • Reviewing and contributing to system design proposals, providing feedback on scalability, fault tolerance, and operational complexity.
  • Driving capacity planning, traffic modeling, and cost optimization strategies across globally distributed infrastructure.
  • Evaluating and recommending infrastructure tools, platforms, and vendors, including media servers, CDN providers, cloud-native services, and edge networking.
  • Ensuring consistent standards for CI/CD pipelines, deployment safety, and progressive rollout strategies across teams.
  • Acting as the primary SRE partner for multiple engineering teams building real-time features, attending planning sessions, and providing operational readiness guidance.
  • Collaborating closely with network engineering, security, product, and data teams to align on platform-wide reliability requirements.
  • Translating infrastructure constraints and reliability trade-offs into actionable recommendations for product leaders and engineering teams.
  • Establishing and advocating DevOps best practices (infrastructure-as-code, GitOps, automated testing, and deployment automation) across partner teams.
  • Guiding senior engineers on SRE principles, reliability patterns, and operational discipline.
  • Serving as a technical liaison between U.S.-based and China/India-based engineering teams, bridging communication gaps and providing technical context.
  • Conducting architecture reviews, incident retrospectives, and planning sessions in English and Mandarin as appropriate.
  • Maintaining a flexible schedule to ensure meaningful overlap with teams in Beijing, Shanghai, Bangalore, and Hyderabad.
  • Building collaborative relationships across cultural and geographic boundaries, adapting communication styles to foster trust and alignment.
  • Ensuring engineering documentation, runbooks, and architectural decision records are accessible and understandable for global team members.

What we’re looking for
  • Have 10+ years in DevOps, SRE, or infrastructure engineering roles, with at least 3 years at staff- or principal-level scope.
  • Have a proven track record owning reliability for large-scale, distributed, latency-sensitive systems in production.
  • Have experience in supporting real-time or media-heavy platforms (video conferencing, live streaming, gaming, trading systems, or similar).
  • Demonstrate ability to lead cross-functional technical initiatives without direct authority, driving alignment across engineering, product, and operations.
  • Have conceptual and architectural understanding of real-time communication protocols: WebRTC, RTP/RTCP, TURN/STUN, SDP, and SFU/MCU topologies.
  • Have solid expertise in cloud…