Site Reliability Engineer

Job in Deerfield, Lake County, Illinois, 60063, USA
Listing for: Diverse Lynx
Full Time position
Listed on 2026-06-01
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, IT Support, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 120000 - 140000 USD Yearly
Job Description & How to Apply Below
Job Title: Site Reliability Engineer
Location: Deerfield, IL
Duration: Full-time


Skills: Azure
Salary: $120K - $140K/Year


Must Have Technical/Functional Skills:
  • 7+ years of experience in SRE, platform engineering, or cloud infrastructure engineering in large-scale enterprise environments (10,000+ employees or equivalent complexity).
  • Deep, hands-on expertise with Microsoft Azure, including a minimum of 4 years in a primary Azure cloud engineering role.
  • Expert-level proficiency with AKS: cluster lifecycle management, RBAC, network policies, pod security standards, cluster autoscaler, and Workload Identity.
  • Strong infrastructure-as-code skills:
    Terraform (required) and/or Bicep; experience managing Azure Landing Zones or Enterprise-Scale architecture.
  • Proficiency in at least one systems programming/scripting language:
    Python (preferred), Go, or PowerShell.
  • Experience designing and operating enterprise observability platforms using Azure Monitor, Log Analytics and Application Insights at scale.
  • Demonstrable track record of owning SLOs/SLIs and delivering measurable reliability improvements in production.
  • Strong knowledge of enterprise networking in Azure:
    Hub-and-Spoke/Virtual WAN, ExpressRoute, Azure Firewall, NSGs, Private Endpoints, and Private DNS Zones.
Required/Preferred Certifications:
  • AZ-104 | AZ-305 (Preferred) | AZ-400 (Preferred) | CKA | ITIL v4 Foundation

Roles and Responsibilities:


Reliability & Availability Engineering
  • Define, own, and enforce enterprise-wide SLOs, SLIs, and Error Budgets across all Tier-0 and Tier-1 Azure-hosted services; report SLA compliance to executive stakeholders monthly.
  • Lead architectural reviews for new services and ensure reliability non-functionals (availability targets, RTO/RPO) are embedded from design through to production.
  • Champion and implement chaos engineering practices using Azure Chaos Studio and custom fault injection frameworks to proactively surface reliability risks.
  • Drive Disaster Recovery (DR) design and conduct quarterly DR drills across Azure paired regions.

Incident Management & On-Call
  • Serve as Incident Commander for P1/P2 major incidents; own the end-to-end incident lifecycle from detection through resolution and Post-Incident Review (PIR).
  • Participate in a structured On-Call rotation with follow-the-sun global coverage; maintain response SLAs of