Senior Site Reliability Engineer

Job in 242221, Gurugram, Uttar Pradesh, India
Listing for: Snapmint
Full Time position
Listed on 2026-06-17
Job specializations:
  • IT/Tech
    SRE/Site Reliability, Cloud Computing: Infrastructure & Operations, Systems Engineer, IT Support
Job Description & How to Apply Below
Senior Site Reliability Engineer (SRE)
Summary
We are looking for a Senior Site Reliability Engineer (SRE) to build and operate scalable, reliable, and secure platform infrastructure. The ideal candidate will drive automation, observability, incident management, and cloud-native best practices to improve system reliability and operational excellence across distributed systems.

Roles & Responsibilities
Define and manage SLIs, SLOs, and error budgets for critical services
Design and enhance monitoring, logging, alerting, and tracing capabilities
Automate operational processes and improve platform efficiency
Participate in incident response, root cause analysis (RCA), and postmortem reviews
Support production environments through on-call rotations and reliability initiatives
Improve system performance, scalability, availability, and capacity planning
Collaborate with engineering teams to enhance application resiliency and operational readiness
Drive adoption of Infrastructure as Code (IaC) and CI/CD best practices
Maintain highly available, fault-tolerant, and secure cloud infrastructure

Skills
Strong Linux/Unix administration and debugging skills
Proficiency in Python/Bash/Shell scripting and automation
Expertise in observability and monitoring tools such as Grafana, Prometheus, ELK, and New Relic
Strong expertise in AWS and cloud infrastructure management
Strong experience with log analysis and monitoring using ELK
Strong incident management, communication, and operational excellence mindset
Hands-on experience with Kubernetes, Docker, and container orchestration
Experience with Terraform and Infrastructure as Code practices
Strong understanding of networking, DNS, load balancing, and distributed systems
Experience with CI/CD tools such as Jenkins, GitHub Actions, GitLab CI, or ArgoCD

Qualifications
B.Tech/B.E. or equivalent
4+ years of experience in SRE, DevOps, Platform Engineering, or Systems Engineering

Good to Have
Bachelor's degree in Computer Science, Engineering, or a related field
Cloud or Kubernetes certifications
Experience managing production incidents in high-availability environments
Exposure to multi-cloud architectures (AWS/GCP/Azure)

Position Requirements
10+ years of work experience