Senior Manager of Site Reliability (SASE), San Francisco area, California, USA, IT/Tech

Position: Senior Manager of Site Reliability (SASE)

Requirements

10+ years in SRE, Infrastructure or DevOps environments
5+ years managing global teams of 15+ engineers across multiple time zones
Deep understanding of Cloud Native ecosystems (Azure/AWS/GCP), Kubernetes and CI/CD pipelines
Proven track record of implementing ML-driven monitoring (e.g., anomaly detection, automated root cause analysis, event correlation)
Exceptional ability to translate "deep tech" into business value for C-suite stakeholders
Experience using AI tools like Claude, Gemini or Copilot to build solutions is mandatory

What the job involves

We are looking for a visionary Senior Manager of Site Reliability Engineering to lead our global SRE organization across the US and India.
This isn't just a "keep the lights on" role; you will be the primary architect of our AI-driven Autonomous SRE transformation at Palo Alto Networks. You will bridge the gap between infrastructure products and operational excellence, gathering complex requirements from product teams and translating them into automated, intelligent self-service platform capabilities so that our systems are not just reliable but self-healing.
Directly manage and scale a high-performing, multi-geographical SRE team (US and India), fostering a culture of psychological safety, continuous learning, and "operational pride."
Standardize SRE practices globally while respecting local nuances, ensuring 24/7 coverage models (Follow-the-Sun) are seamless and burnout-resistant
Manage the financial aspects of global headcount and cloud infrastructure spend
Drive the Autonomous SRE Roadmap:

Transition the organization from reactive monitoring to proactive, AI-driven observability and incident remediation using machine learning to reduce Mean Time to Recovery (MTTR)
Act as the lead consultant for infrastructure product teams to define what "reliability" looks like for next-gen AI services
Partner with the Platform Engineering team to build and internalize "Golden Paths" that bake in SLOs, error budgets, and automated canary analysis
Work hand-in-hand with InfoSec and Compliance to automate guardrails (Policy-as-Code) and ensure global data sovereignty requirements are met
Influence R&D leadership to prioritize non-functional requirements and technical debt reduction
