Senior Site Reliability Engineer
Listed on 2026-03-07
IT/Tech
Cloud Computing, Systems Engineer
Our Mission
Healthcare should work for patients, but it doesn’t. In their time of need, they call down outdated insurance directories. Then wait on hold. Then wait weeks for the privilege of a visit. Then wait in a room solely designed for waiting. Then wait for a surprise bill. In any other consumer industry, the companies delivering such a poor customer experience would not survive.
But in healthcare, patients lack market power. Which means they are expected to accept the unacceptable.
Zocdoc’s mission is to give power to the patient. To do that, we’ve built the leading healthcare marketplace that makes it easy to find and book in‑person or virtual care in all 50 states, across 200+ specialties and 12k+ insurance plans. By giving patients the ability to see and choose, we give them power. In doing so, we can make healthcare work like every other consumer sector, where businesses compete for customers, not the other way around.
In time, this will drive quality up and prices down. We’re 18 years old and the leader in our space, but we are still just getting started. If you like solving important, complex problems alongside deeply thoughtful, driven, and collaborative teammates, read on.
Zocdoc is looking for a Senior Site Reliability Engineer to help develop, monitor, and maintain our distributed production systems. You’ll be challenged with building frameworks and processes for ensuring uptime for our patients and providers in a constantly changing environment. You’ll work with distributed systems and microservices, leveraging many interconnected services in AWS Cloud. We’re looking for someone who loves challenging the status quo and strives to make everything they touch safer, more secure, faster, and easier to maintain.
You’ll enjoy this role if you are…
- Passionate about ensuring complex systems never skip a beat
- Motivated to learn new technologies and design patterns, and to work in the cloud
- Comfortable in an outage situation and a believer in blameless post‑mortems
- Excited to work in a highly collaborative environment with diverse individuals and numerous product development teams to improve future uptime
- Eager to foster a strong DevOps culture in which product teams share a big role in site reliability and first response
- Autonomous, individually accountable, and always pushing to improve
- A believer that diverse and inclusive teams and cultures are non‑negotiable
Your day to day is…
- Monitoring and maintaining complex cloud‑based infrastructure, systems, and services and ensuring their uptime to help millions of patients get the care they need
- Automating and developing our tooling, processes, and infrastructure to speed up development and make them repeatable and error‑proof
- Supporting our large product engineering org with their scaling, performance, and uptime needs, as well as helping diagnose and debug production-related issues
- Analyzing and performance-tuning systems, code, and networking for scaling and optimal operation
- Working with cutting-edge GenAI tools and technology
You’ll be successful in this role if you have…
- 5+ years of supporting consumer-facing web application production environments and systems in a Site Reliability Engineering or Production Engineering role
- 2+ years of on‑call experience in a 24/7 cloud‑based production environment
- 2+ years of experience in managing and supporting modern cloud‑based environments and infrastructure like AWS/GCP, Docker, Kubernetes, etc.
- Experience with edge technologies such as load balancers, reverse proxies, web application firewalls, routing, etc.
- Deep understanding of protocols such as TCP/IP, HTTP/HTTPS, TLS, DNS, NTP
- A Bachelor’s degree in Computer Science, Computer Engineering, or equivalent engineering experience is a plus, but not required
Benefits
- Flexible, hybrid work environment at our convenient Soho location (if based in NYC)
- Unlimited Vacation
- 100% paid employee health benefit options (including medical, dental, and vision)
- Commuter Benefits
- 401(k) with employer-funded match
- Corporate wellness program with Wellhub
- Sabbatical leave (for employees with 5+ years of service)
- Competitive paid parental leave and fertility/family…