Site Reliability Engineer Job, Seattle area, Washington, USA, IT/Tech

This role is the team-facing, consultative side of observability. The senior engineer partners directly with internal engineering teams to understand their systems, pain points, and reliability gaps. They translate team needs into observability solutions: dashboards, metrics, SLOs, SLIs, alerting strategies, and visibility improvements.

How this role works day to day:

Meet with internal teams to gather technical and operational requirements
Design and implement tailored observability solutions across tools like Grafana, Sumo Logic, AppDynamics, and New Relic
Build deeper dashboards for product teams and executive visibility
Define and maintain SLOs, SLIs, and reliability reporting patterns
Identify gaps in monitoring or alerting and lead the solutioning
Partner with embedded SREs across a hub-and-spoke model
Influence tool consolidation, standards, and enterprise reliability strategy

Top 3 Skills:

Advanced Grafana Expertise - Strong ability to create complex dashboards, build transformations, define SLOs/SLIs, and integrate with multiple data sources.
SRE Principles and System Thinking - Deep understanding of service health, SLOs, SLIs, error budgets, incident patterns, distributed systems, and reliability engineering fundamentals.
Cross-Team Collaboration and Technical Requirements Gathering - Ability to sit with teams, understand their needs, translate them into observability solutions, and deliver dashboards, alerting, and reliability patterns.

Core Responsibilities:

Required Skills:

3+ years of hands-on observability experience (Grafana required plus supporting tools)
2+ years practicing SRE fundamentals (SLOs/SLIs, incident patterns, distributed systems, reliability engineering)
5+ total years in SRE, DevOps, cloud, systems, platform, or monitoring engineering roles
Experience partnering with application teams to gather requirements and deliver solutions
Strong ability to explain complex concepts clearly to non-SRE partners

Nice to Have:

Experience with ThousandEyes, AppDynamics, New Relic, or Sumo Logic
Familiarity with Azure, Kubernetes, CI/CD pipelines, or software delivery platforms
Experience contributing to observability standards at scale
Background in high-uptime industries such as travel, finance, telecom, or cloud-based SaaS
