Site Reliability Engineer IV
Job in Mountlake Terrace, Snohomish County, Washington, 98043, USA
Listed on 2026-06-26
Listing for: Premera
Full Time position
Job specializations:
* IT/Tech
* SRE/Site Reliability, Cloud Computing: Infrastructure & Operations, Systems Engineer
Job Description & How to Apply Below
Mountlake Terrace, WA
Time type: Full time
Posted on: 2 days ago
Job requisition: R28853
**Workforce Classification:** Hybrid
**Join Our Team: Do Meaningful Work and Improve People’s Lives**
Our purpose, to improve customers’ lives by making healthcare work better, is far from ordinary. And so are our employees. Working at Premera means you have the opportunity to drive real change by transforming healthcare.
Premera is committed to being a workplace where people feel empowered to grow, innovate, and lead with purpose. By investing in our employees and fostering a culture of collaboration and continuous development, we’re able to better serve our customers. It’s this commitment that has earned us recognition as one of the best companies to work for. Learn more about our recent awards and recognitions as a greatest workplace.
Learn how Premera supports our members, customers and the communities that we serve through our Healthsource blog:
**Site Reliability Engineer IV**
**Job Description Summary**
As a Site Reliability Engineer IV, you will drive reliability and operational excellence across cloud, on-premise, and hybrid platforms. You will build scalable automation and AI-powered tooling to improve system health, reduce manual effort, and accelerate incident response.
Partnering with software and platform engineering teams, you will standardize CI/CD, observability, and incident management practices, enabling resilient, self-healing systems. This role is critical to scaling engineering reliability and advancing intelligent automation across enterprise platforms.
**This is a hybrid role, located on our campus in Mountlake Terrace, Washington.**
**What You’ll Do**
* Build, run, and optimize critical services across cloud, on-premise, and hybrid environments, including managed services, custom applications, and third-party integrations
* Develop automation and AI-powered tooling to reduce manual intervention, including anomaly detection, predictive alerting, and LLM-assisted diagnostics that surface actionable insights
* Design and implement end-to-end observability, telemetry, and self-healing capabilities across platforms
* Lead cross-team efforts to drive root cause analysis, post-incident reviews, and long-term reliability improvements
* Define and drive reliability strategy, standards, and best practices across engineering teams
* Standardize workflows for change management, deployment, and incident response, replacing manual processes with tooling-driven solutions
* Partner with engineering and security teams to ensure deployment pipelines and automation practices meet reliability, safety, and compliance standards
* Influence adoption of modern DevOps practices including CI/CD, infrastructure-as-code, and test-driven development
* Stay current on emerging technologies in AI/ML, DevOps, and platform engineering, and apply them to improve operational efficiency
* Participate in the on-call rotation and support production systems as needed
* This role does not involve day-to-day coding, but it requires strong technical depth to guide teams, conduct rapid proofs of concept, and advise on performance, reliability, cost, and operational excellence.
**Minimum Qualifications**
* Bachelor’s degree in Computer Science, Information Systems, or related field, or equivalent experience
* 7+ years of experience in Site Reliability Engineering, DevOps, or IT Operations within complex environments
* Demonstrated experience leveraging AI platforms and tooling to design and build automation solutions.
**Preferred Qualifications**
* Hands-on experience applying AI/ML to operational workflows, including anomaly detection, predictive alerting, or intelligent automation at scale
* Advanced experience with Kubernetes, Docker, and container-based platforms
* Deep expertise with event streaming platforms
* Experience working across cloud, on-premise, and hybrid environments
* Experience working in large-scale, regulated enterprise environments.
**Knowledge, Skills, and Abilities**
* Advanced troubleshooting across distributed systems and applications
* Proficiency in one or…