Description
SRE Lead
Hybrid Model
Experience:
- 8+ years
Must Have:
- PagerDuty, Grafana, OpenSearch, AWS, Kubernetes
Requirements
The SRE Lead is responsible for ensuring the availability, performance, resiliency, and operational stability of enterprise technology platforms. This role supports a predominantly AWS-based cloud environment that includes legacy Microsoft/.NET applications as well as modern Java-based services under active development.
The SRE will work closely with application development, infrastructure, security, and architecture teams to establish reliability standards, automate operational processes, and support the organization's ongoing cloud modernization efforts.
- Strong experience in SRE / DevOps practices
- Hands-on experience with PagerDuty, Grafana, and OpenSearch
- Good knowledge of AWS and Terraform
- Experience with Jira and ServiceNow
- 8+ years of experience in Site Reliability Engineering, DevOps, or Enterprise Operations
- Demonstrated experience operating production workloads in AWS
- Strong background supporting Microsoft technologies, including:
  - .NET / C# applications
  - Windows Server environments
- Working knowledge of Java-based application platforms
- Experience with Infrastructure as Code and CI/CD pipelines
- Proficiency in scripting or programming (PowerShell, Python, Bash, or similar)
- Strong analytical, troubleshooting, and communication skills
Job Responsibilities
Platform Reliability & Operations
- Ensure the stability, availability, and performance of mission-critical production systems
- Support both legacy Microsoft-based platforms and modern cloud-native services
- Participate in scheduled on-call rotations and lead structured incident response activities
- Conduct root cause analysis (RCA) and drive corrective and preventive actions
- Define, measure, and report on service health metrics, SLAs, SLIs, and SLOs
Cloud Infrastructure Management
- Design, deploy, and operate scalable and resilient infrastructure in Amazon Web Services (AWS)
- Manage enterprise AWS services including compute, storage, networking, identity, and monitoring
- Support hybrid architectures integrating AWS with on-premises or Microsoft-based systems
- Apply infrastructure standards, security controls, and cost-management practices
Automation & Engineering Enablement
- Develop and maintain Infrastructure as Code (IaC) using approved enterprise tooling
- Automate deployment, configuration, scaling, and recovery processes
- Partner with application teams to improve release reliability for .NET/C# and Java applications
- Reduce manual operational effort through standardized automation and tooling
Monitoring, Observability & Compliance
- Implement and maintain enterprise monitoring, logging, and alerting solutions
- Ensure systems meet availability, performance, and compliance requirements
- Create dashboards and operational reporting for leadership and stakeholders
- Maintain runbooks, operational documentation, and support procedures
Continuous Improvement & Modernization
- Support application modernization and cloud migration initiatives
- Provide reliability and operability input into architecture and design reviews
- Promote best practices in resilience, fault tolerance, and disaster recovery
- Contribute to enterprise standards, patterns, and technical governance