Lead Cloud Site Reliability Engineer

Job in Salt Lake City, Salt Lake County, Utah, 84193, USA
Listing for: NICE
Full Time position
Listed on 2026-02-19
Job specializations:
  • IT/Tech
    SRE/Site Reliability, Cloud Computing
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 USD yearly
Job Description & How to Apply Below

At NiCE, we don’t limit our challenges. We challenge our limits. Always. We’re ambitious. We’re game changers. And we play to win. We set the highest standards and execute beyond them. And if you’re like us, we can offer you the ultimate career opportunity that will light a fire within you.

Overview

The Lead Site Reliability Engineer is a senior technical leader responsible for elevating the reliability, availability, and operational maturity of our SaaS platform. This role sets engineering standards for the entire SRE organization, drives platform-wide initiatives, and leads the execution of cross-team, high-impact reliability projects. You will partner closely with SRE managers, engineering teams, platform owners, and incident/problem management to shape how reliability is built, measured, and continuously improved across all services.

Organizational Standards & Strategy
  • Define, maintain, and evangelize SRE standards, frameworks, and best practices across the entire SRE organization.
  • Establish consistent patterns for SLI/SLO design, observability instrumentation, incident response, readiness reviews, and postmortem quality.
  • Partner with architecture and engineering leadership to ensure reliability is embedded in solution design.
  • Lead multi-team efforts to reduce toil, improve quality, and increase service resilience.
Cross‑Team Technical Leadership
  • Lead large‑scale, strategic SRE initiatives requiring alignment across multiple SRE and engineering teams.
  • Serve as the technical owner for cross‑functional reliability projects—including scope, timelines, and technical decisions.
  • Provide deep technical guidance on cloud architecture, distributed systems reliability, and automation patterns.
  • Create advanced observability dashboards and distributed tracing solutions to provide visibility across product lines.
  • Automate manual operational processes to eliminate toil and increase efficiency across teams.
  • Lead and mentor engineers in performance analysis, capacity planning, and reliability‑focused system design.
  • Drive consistency and maturity in monitoring and alerting implementations across services.
Incident, Problem & Operational Excellence
  • Oversee and elevate blameless incident response and ensure high‑quality postmortems across SRE teams.
  • Partner with Incident & Problem Management to identify systemic weaknesses and lead long‑term remediation.
  • Provide highest‑tier on‑call leadership for critical incidents, guiding teams in improving MTTR and outage prevention.
Coaching, Mentorship & Team Maturity
  • Mentor senior and mid‑level SREs, uplift team capability, and provide technical coaching and training.
  • Review complex engineering work and provide robust, actionable feedback.
  • Help teams develop and adopt operational playbooks, engineering processes, and shared troubleshooting libraries.
Required Experience
  • Bachelor’s degree in Computer Science, Information Systems, or equivalent experience.
  • 6+ years in SRE, platform engineering, or cloud reliability roles.
  • Expert‑level proficiency in public cloud ecosystems (AWS, GCP, Azure).
  • Advanced programming/scripting experience (Python, Go, Java, or similar).
  • Deep experience with monitoring, automation, CI/CD, and observability tools.
  • Proven success leading complex cross‑functional engineering initiatives.
  • Outstanding communication skills for both technical and executive‑level audiences.
Bonus Qualifications
  • Experience defining SRE organizational standards or building an SRE practice.
  • Hands‑on experience with Kubernetes, microservices, Terraform, or Ansible.
  • Strong background in distributed systems and fault‑tolerant architectures.
Why This Role Matters

This role is a cornerstone leadership position for the organization. The Lead SRE will shape how reliability is engineered, enforced, measured, and improved across the entire SaaS platform—elevating how every SRE team operates and driving initiatives that make the company more reliable, scalable, and resilient for years to come.

About NiCE

NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime, and ensure public safety. Every day, NiCE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions.

Known as an innovation powerhouse that excels in AI, cloud and digital, NiCE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries.

NiCE is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, age, sex, marital status, ancestry, neurotype, physical or mental disability, veteran status, gender identity, sexual orientation or any other category protected by law.
