
Senior Site Reliability Engineer

Job in Salt Lake City, Salt Lake County, Utah, 84193, USA
Listing for: O.C. Tanner
Full Time position
Listed on 2026-03-12
Job specializations:
  • IT/Tech
    SRE/Site Reliability, Cloud Computing, Systems Engineer, IT Support
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD Yearly
Job Description & How to Apply Below

O.C. Tanner is the #1 provider of employee recognition solutions, helping organizations worldwide create thriving workplace cultures. Our mission is simple yet powerful: we help people thrive at work by fostering appreciation, engagement, and connection. Through our award-winning recognition platform, we empower companies to celebrate achievements, strengthen relationships, and build workplaces where employees feel valued and inspired.

About the Role

We are seeking a Senior Site Reliability Engineer to join our team and help ensure the reliability, scalability, and performance of our world-class employee recognition platform. This role is ideal for someone who thrives at the intersection of software engineering and operations, with a passion for building resilient systems and improving customer experience.

Key Responsibilities
  • Reliability & Performance:
    Design, implement, and maintain monitoring, alerting, and tracing solutions for cloud-native applications. Drive improvements in uptime, latency, and overall system performance.
  • Observability:
    Develop and manage observability platforms (e.g., OpenTelemetry, Datadog, Coralogix) to provide actionable insights. Collaborate with engineering teams to define metrics, logs, and traces for new and existing services.
  • Incident Management:
    Lead incident response efforts, including root cause analysis and postmortems. Implement best practices for incident detection, escalation, and resolution.
  • Cloud & Infrastructure:
    Manage and optimize Kubernetes clusters and AWS cloud resources. Automate infrastructure provisioning and scaling using Infrastructure-as-Code tools.
  • Collaboration:
    Partner with software engineering teams to gather requirements for monitoring and reliability. Advocate for SRE principles and help teams adopt best practices for resilience and performance.
  • On-Call Responsibilities:
    Participate in a rotating on-call schedule to ensure 24/7 coverage for critical systems. Respond to alerts promptly, troubleshoot issues, and restore service during outages. Continuously improve on-call processes to reduce noise and enhance response efficiency.
Required Qualifications
  • Experience:
    5+ years in Site Reliability Engineering, DevOps, or related roles.
  • Programming Skills:
    Proficiency in at least one modern programming language (e.g., Python, Go, Java).
  • Experience with Infrastructure-as-Code tools (Terraform, CloudFormation).
  • Observability Expertise:
    Hands-on experience with OpenTelemetry, Datadog, or similar platforms.
  • Cloud & Containers:
    Strong knowledge of AWS services and Kubernetes.
  • Incident Management:
    Proven track record in managing production incidents and improving MTTR.
  • Monitoring & Alerting:
    Deep understanding of metrics, logging, tracing, and alerting strategies for distributed systems.
  • Collaboration:
    Ability to work closely with software engineers to design reliable systems and improve application performance.
  • On-Call Readiness:
    Comfortable participating in on-call rotations and handling high-pressure situations.
Preferred Qualifications
  • Familiarity with CI/CD pipelines and automation frameworks.
  • Knowledge of e-commerce platforms and high-traffic systems.
Why Join Us?
  • Work on a high-scale e-commerce platform impacting millions of customers.
  • Collaborate with talented engineers in a culture that values reliability, innovation, and ownership.
  • Competitive compensation, benefits, and opportunities for growth.
Position Requirements
10+ Years work experience