Senior Director - Reliability Operations

Job in San Francisco, San Francisco County, California, 94102, USA
Listing for: Gap
Full Time position
Listed on 2026-06-28
Job specializations:
  • IT/Tech
    IT Project Manager, SRE/Site Reliability, Cloud Computing: Infrastructure & Operations, Systems Engineer
Job Description & How to Apply Below

Senior Director - Reliability Operations

The Senior Director - Reliability Operations is a strategic leader accountable for ensuring the reliability, availability, and performance of the enterprise technology ecosystem. This role oversees all ITIL-based service management functions, Site Reliability Engineering (SRE), the ServiceNow Platform, Mission Control, and Live Sight Insights. This leader drives operational excellence through a proactive reliability strategy that combines process discipline, automation, observability, and real-time insights.

They will partner closely with engineering, infrastructure, cybersecurity, and product teams to build and sustain resilient systems that power Gap Inc.'s digital and in-store experiences. As a thought leader, the Sr. Director will shape the long-term vision for operational reliability and service management—defining modern capabilities, optimizing service performance, and establishing an innovation-driven reliability culture.

The responsibilities include:

Strategic Leadership & Vision

  • Define and execute the enterprise Reliability Operations strategy, ensuring alignment with business objectives and technology roadmaps.
  • Lead transformation of ITIL functions into agile, data-driven service management capabilities across incident, problem, change, and configuration management.
  • Partner with senior technology and business leaders to embed reliability and performance metrics into product development and operational planning.

Operational Excellence & Reliability Engineering

  • Lead Site Reliability Engineering (SRE) practices across platforms and services—driving automation, self-healing capabilities, and proactive monitoring to achieve measurable service resiliency improvements.
  • Establish standards for availability, latency, scalability, and operational efficiency through engineering-driven reliability principles.
  • Champion reliability by design—ensuring observability, capacity planning, and chaos testing are core to delivery processes.

Mission Control & Live Sight Insights

  • Oversee the Mission Control organization responsible for real-time system monitoring, incident command, and critical event management.
  • Drive adoption of Live Sight Insights to create predictive and actionable intelligence on service health and performance trends.
  • Enable enterprise visibility of key metrics through intuitive dashboards and business-impact-based alerting models.

ServiceNow Governance Ownership

  • Own the ServiceNow Platform governance strategy and roadmap, ensuring it enables ITIL process excellence, automation, and cross-enterprise workflow integration.
  • Collaborate with product and engineering teams to apply industry best practices across ServiceNow's capabilities, including IT, HR, Security, and Enterprise Operations.
  • Lead with a platform governance mindset, focusing on reliability, scalability, and ease of use.

People Leadership & Culture

  • Build, inspire, and develop a high-performing global Reliability Operations team that embodies accountability, collaboration, and innovation.
  • Foster a culture of data-driven decision making, continuous learning, and operational excellence.
  • Serve as a mentor and coach to emerging leaders—raising the organizational bar for reliability engineering and service leadership.

Cross-Functional Partnership

  • Work closely with Software Engineering, Infrastructure, Cybersecurity, and Business Technology teams to ensure reliability objectives are integrated end-to-end.
  • Partner with Enterprise Architecture and Program Management to align technology investments with reliability outcomes.
  • Act as a trusted advisor to executive leadership on reliability strategy, risk posture, and performance health of the enterprise environment.

Who you are:

  • Proven strategic leader with more than 10 years of success driving operational transformation at scale in complex, global environments.
  • Deep expertise in ITIL frameworks, SRE principles, ServiceNow platform administration and architecture, and modern observability practices.
  • Strong technical understanding across infrastructure, cloud operations, automation, and service management ecosystems.
  • Exceptional ability to influence at all levels—translating…
Position Requirements
10+ years of work experience