Azure SRE: Reliability, Observability & Incident Leadership

Job in Deerfield, Lake County, Illinois, 60063, USA
Listing for: Veriipro
Full Time position
Listed on 2026-05-28
Job specializations:
  • IT/Tech
    Cloud Computing, SRE/Site Reliability, Systems Engineer, IT Support
Salary/Wage Range or Industry Benchmark: USD 100,000 - 125,000 per year
Job Description & How to Apply Below

We are looking for an experienced Site Reliability Engineer (SRE) to ensure the reliability, availability, and performance of Azure-based services in a large-scale enterprise environment. This role involves managing cloud infrastructure, enhancing observability, implementing disaster recovery strategies, and driving reliability improvements through SLOs/SLIs and automation.

Key Responsibilities
  • Define and manage SLOs, SLIs, and error budgets for Azure-hosted services, and report SLA compliance to stakeholders.
  • Lead architectural reviews, ensuring reliability targets (availability, RTO/RPO) are met from design to production.
  • Implement chaos engineering practices and conduct disaster recovery drills across Azure regions.
  • Serve as Incident Commander for P1/P2 incidents, owning the incident lifecycle and post-mortem actions.
  • Design and operate enterprise observability using Azure Monitor, Log Analytics, Application Insights, and Grafana.
  • Develop alerting frameworks and automate self-healing operations with Azure Automation and scripting (Python/PowerShell).
  • Embed reliability gates in CI/CD pipelines and manage AKS cluster reliability (scaling, upgrades, security).
  • Enforce infrastructure-as-code best practices with Terraform/Bicep for Azure Landing Zones.

Required Qualifications
  • 7+ years in SRE, platform engineering, or cloud infrastructure in large-scale environments.
  • 4+ years of hands-on Azure experience with AKS and cloud engineering.
  • Expertise in Terraform (required), Bicep, and managing Azure Landing Zones.
  • Proficiency in Python, Go, or PowerShell scripting.
  • Experience with Azure observability tools (Monitor, Log Analytics, Application Insights).
  • Proven track record of owning SLOs/SLIs and improving production reliability.