
Senior Production Engineer

Remote / Online - Candidates ideally in
San Jose, Santa Clara County, California, 95199, USA
Listing for: Zscaler
Part Time, Remote/Work from Home position
Listed on 2026-06-03
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing
Salary/Wage Range or Industry Benchmark: 140000 - 200000 USD Yearly
Job Description & How to Apply Below
Position: Senior Staff Production Engineer

Zscaler accelerates digital transformation to ensure our customers can be more agile, efficient, resilient, and secure. As an AI-forward enterprise, we are constantly pushing the envelope, leveraging the world’s largest security data lake to power our cloud-native Zero Trust Exchange platform. This innovation protects our customers from cyberattacks and data loss by securely connecting users, devices, and applications in any location.

Here, impact in your role matters more than title and trust is built on results. We say, impact over activity. We seek innovators who actively use AI to amplify their impact and who thrive in an environment where we leverage intelligent systems to stay ahead of evolving threats. We believe in transparency and value constructive, honest debate: we’re focused on getting to the best ideas, faster. We build high-performing teams that can make an impact quickly and with high quality. To do this, we are building a culture of execution centered on customer obsession, collaboration, ownership, and accountability.

We value high-impact, high-accountability with a sense of urgency where you’re enabled to do your best work and embrace your potential. If you’re driven by purpose, thrive on solving complex challenges, and want to be part of the team that’s helping to secure the AI age, we invite you to bring your talents to Zscaler and help shape the future of cybersecurity.

Role

We are looking for a Sr. Staff Production Engineer to join our team. This role is available as a hybrid opportunity 3 days a week in San Jose, CA or as a remote position, reporting to Production Engineering in the Cloud Infrastructure & Operations department. Join Zscaler to be a force multiplier for the reliability of a global platform protecting over 15 million users.

In this role, you will provide the technical vision and hands‑on execution to drive an “automation‑first” culture across the company. By maturing our observability and architectural standards, you will directly reduce our Mean Time to Mitigate (MTTM) and shape the scalability of our globally distributed, multi‑cloud infrastructure.

What you’ll do (Role Expectations)
  • Design and implement highly available, scalable infrastructure across AWS, Azure, GCP, and bare‑metal environments
  • Drive an “automation‑first” culture by writing code (Python/Go) to eliminate manual toil and build self‑healing systems
  • Implement and maintain sophisticated observability (Prometheus, Grafana, OpenTelemetry), define SLIs/SLOs, and establish error budgets
  • Act as a lead Incident Commander (TDO on‑call), develop response playbooks, and conduct deep‑dive post‑incident analyses
  • Partner with Engineering and partner teams to conduct operability reviews
Who You Are (Success Profile)
  • You act like an owner with a bias for action and integrity.
  • You are a pragmatic builder obsessed with creating, iterating, and shipping.
  • You champion simplicity by distilling complex problems into clear, actionable plans.
  • You are data‑driven, valuing evidence over assumptions.
  • You think at scale, building solutions and processes that last in a high‑growth global organization.
What We’re Looking for (Minimum Qualifications)
  • 8+ years of experience managing reliability, scalability, and availability for large‑scale production services
  • Deep expertise in programming (e.g., Python, Go, or C/C++)
  • Strong background in networking protocols, Linux/FreeBSD systems, and distributed architecture
  • Experience in high‑stakes incident management and participation in a 24/7 on‑call rotation
  • Proficiency in leveraging ITIL frameworks and incident data to drive service maturity through systematic problem management and technical operability reviews
What Will Make You Stand Out (Preferred Qualifications)
  • Extensive experience with public cloud (AWS, Azure, GCP) and Infrastructure‑as‑Code (Ansible, Terraform)
  • Experience with chaos engineering and disaster recovery planning at scale
  • Expertise in global routing (BGP) and traffic tunneling (GRE, IPsec) with a deep understanding of L7 proxy architectures (HAProxy), DNS at scale, and OS networking stack internals

Base Pay Range

$140,000 - $200,000 USD

At Zscaler, we are committed to building a…

Position Requirements
10+ Years work experience