Site Reliability Engineer
Job in
Deerfield, Lake County, Illinois, 60063, USA
Listed on 2026-06-05
Listing for:
Tata Consultancy Services Limited
Full Time position
Job specializations:
- IT/Tech
Systems Engineer, IT Support, Cloud Computing, Cybersecurity
Job Description & How to Apply Below
- 7+ years of experience in SRE, platform engineering, or cloud infrastructure engineering in large-scale enterprise environments
- Deep, hands-on expertise with Microsoft Azure, with a minimum of 4 years in a primary Azure cloud engineering role.
- Expert-level proficiency with AKS: cluster lifecycle management, RBAC, network policies, pod security standards, cluster autoscaler, and Workload Identity.
- Strong experience in microservices development using Java and in CI/CD using Azure DevOps (ADO).
- Experience designing and operating enterprise observability platforms using Dynatrace
- Demonstrable track record of owning SLOs/SLIs and delivering measurable reliability improvements in production.
- Define, own, and enforce enterprise-wide SLOs, SLIs, and Error Budgets across all Tier-0 and Tier-1 Azure-hosted services; report SLA compliance to executive stakeholders monthly.
- Lead architectural reviews for new services and ensure reliability non-functionals (availability targets, RTO/RPO) are embedded from design through to production.
- Champion and implement chaos engineering practices
- Drive Disaster Recovery (DR) design and conduct quarterly DR drills across Azure paired regions.
- Incident Management & On-Call: Serve as Incident Commander for P1/P2 major incidents; own the end-to-end incident lifecycle from detection through resolution and Post-Incident Review (PIR).
- Participate in a structured On-Call rotation with follow-the-sun global coverage; maintain response SLAs of
- Drive a blameless post-mortem culture and ensure all action items from PIRs are tracked and delivered within the agreed SLA.
- Design and operate the enterprise observability stack (Dynatrace logs, alerts, and dashboards); ensure full MELT (Metrics, Events, Logs, Traces) coverage.
- Build and maintain alerting frameworks in Dynatrace with PagerDuty and ServiceNow.
- Develop and operate platform automation, runbooks, and self-healing capabilities using Azure Automation, Logic Apps, and Python/PowerShell scripting.
- Discretionary Annual Incentive.
- Comprehensive Medical Coverage: Medical & Health, Dental & Vision, Disability Planning & Insurance, Pet Insurance Plans.
- Family Support: Maternal & Parental Leaves.
- Insurance Options: Auto & Home Insurance, Identity Theft Protection.
- Convenience & Professional Growth: Commuter Benefits & Certification & Training Reimbursement.
- Time Off: Vacation, Time Off, Sick Leave & Holidays.
- Legal & Financial Assistance: Legal Assistance, 401K Plan, Performance Bonus, College Fund, Student Loan Refinancing.
To View & Apply for jobs on this site that accept applications from your location or country, tap the button below to make a Search.
(If this job is in fact in your jurisdiction, then you may be using a Proxy or VPN to access this site, and to progress further, you should change your connectivity to another mobile device or PC).