Site Reliability Engineer - Senior
Listed on 2026-05-29
IT/Tech
SRE/Site Reliability, Cloud Computing, Systems Engineer, IT Support
Overview
Site Reliability Engineer - Senior Staff
Location: Sunnyvale, California, United States, 94089
In our ‘always on’ world, we believe it’s essential to have a genuine connection with the work you do.
At Ruckus Networks, you will work on large-scale cloud networking platforms that support enterprise customers globally. You will help improve reliability, automation, observability, and customer experience while working with modern cloud and SRE technologies in a collaborative engineering environment.
How you’ll help us connect the world:
Ruckus Networks is looking for a customer-focused Senior Site Reliability Engineer (SRE) to help improve reliability, scalability, operational excellence, and customer experience across our cloud platform ecosystem.
This role is ideal for engineers who enjoy solving production problems, building automation, and improving platform reliability. You will work in a fast-paced environment on distributed systems that power cloud networking services used by customers globally.
As part of the SRE organization, you will work closely with engineering, cloud operations, and support teams to improve platform stability, observability, automation, and operational readiness.
THIS IS A HYBRID ROLE AND NEEDS TO BE ON-SITE AT OUR SUNNYVALE, CA OFFICE 3 DAYS A WEEK. NO RELOCATION OR 3RD-PARTY AGENCIES, PLEASE.
Key Responsibilities
- Operate and improve highly available, scalable cloud services and infrastructure
- Troubleshoot production issues across applications, infrastructure, networking, databases, and cloud services
- Improve observability through metrics, logging, tracing, synthetic monitoring, and alerting
- Help define and improve SLIs, SLOs, and operational health metrics
- Participate in incident response and support Sev-1/customer-impacting events
- Contribute to post-incident reviews and long-term reliability improvements
- Improve operational processes, automation, and deployment safety
- Build operational tooling and automation using Python
- Improve operational efficiency through automation and self-service tooling
- Support CI/CD improvements and deployment validation workflows
- Develop health checks, monitoring integrations, and operational diagnostics
- Support services running in Google Cloud Platform (GCP)
- Work with Kubernetes, containers, and cloud-native platforms
- Analyze scalability, performance, and resource utilization
- Collaborate with software engineering teams on operational readiness and reliability improvements
- Build dashboards, alerts, and telemetry pipelines
- Work with observability platforms such as Prometheus, Grafana, OpenTelemetry, and ELK
- Support monitoring and analytics platforms including ClickHouse
- Improve signal quality and reduce operational alert noise
- Develop synthetic monitoring focused on customer workflows
- Partner with Engineering, Product Management, Customer Support, and Cloud Operations teams
- Participate in architecture and operational readiness discussions
- Mentor junior engineers and contribute to SRE best practices
- Promote operational excellence, ownership, and customer focus
Qualifications
- 5+ years of experience in Site Reliability Engineering, DevOps, Cloud Infrastructure, or Production Engineering
- Strong programming skills in Python
- Experience with Linux systems administration and troubleshooting
- Hands-on experience with Google Cloud Platform (GCP)
- Experience with Kubernetes, containers, and cloud-native infrastructure
- Experience troubleshooting distributed systems in production environments
- Experience with observability tools such as Prometheus, Grafana, OpenTelemetry, or ELK
- Familiarity with ClickHouse or large-scale telemetry platforms
- Understanding of networking fundamentals, APIs, databases, and cloud architectures
- Experience participating in production incident response and operational support
- Experience supporting SaaS or cloud platforms at scale
- Familiarity with Kafka or event-driven architectures
- Experience building automation and monitoring solutions
- Familiarity with wireless networking or enterprise networking platforms
- Experience improving operational…