Software Engineer - Site Reliability Engineering

Job in Foster City, San Mateo County, California, 94420, USA
Listing for: Zoox
Full Time position
Listed on 2026-05-18
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 140,000 - 230,000 USD yearly
Job Description & How to Apply Below

Zoox is seeking a Site Reliability Engineer to help ensure the availability, performance, and resilience of the services that power the development and operation of our autonomous vehicles. In this role, you will own the full lifecycle of our services—from designing fault-tolerant, maintainable systems to deploying, operating, and continuously improving them in production. As a robotics company, Zoox embraces automation at every layer of our infrastructure, and you’ll help drive that ethos forward.

You’ll work hands‑on with systems that process massive volumes of data and support compute-intensive pipelines running on both CPUs and GPUs.

In this role, you will:
  • Architect and optimize scalable systems:
    You will design, implement, and continuously improve highly reliable infrastructure, directly impacting the success and safety of Zoox’s autonomous vehicle platform.
  • Build proactive monitoring solutions:
    You will develop advanced monitoring, alerting, and reporting tools to ensure potential issues are identified and resolved before they affect production.
  • Collaborate across engineering:
    You will partner closely with software engineering teams to elevate our system architecture, streamline deployment processes, and drive automation initiatives.
  • Lead incident resolution:
    You will conduct thorough root cause analyses on production issues and rapidly deploy corrective actions to maintain a resilient and stable environment.
  • Ensure business continuity:
    You will safeguard the company's operations by designing and implementing robust disaster recovery plans to keep the Zoox fleet running smoothly under any circumstances.
Qualifications
  • SRE & Distributed Systems Experience:
    5+ years of experience in site reliability engineering or a similar role, with a strong background in managing large-scale distributed systems.
  • Cloud & Infrastructure as Code (IaC):
    Proven experience operating within major cloud platforms (AWS, GCP, or Azure) and utilizing IaC tools like Terraform, Ansible, Salt, or CloudFormation.
  • Container Orchestration:
    Technical expertise in deploying, managing, and scaling systems using container orchestration technologies such as Kubernetes.
  • Core Infrastructure Knowledge:
    Deep, foundational understanding of networking protocols, storage solutions, and database technologies.
  • Programming Proficiency:
    Strong, demonstrable programming and scripting skills in languages such as Python, Go, C/C++, or Java.
Bonus Qualifications
  • Experience in the automotive or autonomous vehicle industry.
  • Knowledge of security best practices and compliance requirements.

$140,000 - $230,000 a year

Accommodations

If you need an accommodation to participate in the application or interview process please reach out to  or your assigned recruiter.
