Sr. Site Reliability Engineer Job, Austin area, Texas, USA, IT/Tech

Senior Site Reliability Engineer (Sr. SRE)

Location:
Hybrid (1-2 days/week)

We are looking for a Senior Site Reliability Engineer (SRE) to help scale and operate highly available, cloud-based systems. In this role, you'll sit at the intersection of software engineering, DevOps, and platform reliability, ensuring our systems are resilient, observable, and built to perform at scale.

You'll lead incident response, drive automation, and partner closely with engineering teams to embed reliability into everything we build.

What You'll Do:

Own the reliability, availability, and performance of production systems
Lead incident response, on-call operations, and blameless post-mortems
Build and improve monitoring, alerting, logging, and observability
Define and manage SLIs, SLOs, and error budgets
Design and build automation and self-service tools to reduce toil
Support cloud infrastructure (AWS, Azure, GCP) using Infrastructure as Code
Improve CI/CD pipelines and deployment reliability
Partner with engineers on system design and architecture
Create runbooks and operational documentation
Mentor team members and promote SRE and DevOps best practices

What We're Looking For:

5+ years of experience in Site Reliability Engineering, DevOps, Platform, or Cloud Engineering
Strong Linux and production troubleshooting skills
Hands-on experience with AWS, Azure, or GCP
Proficiency in Python, Go, Java, Bash, or similar languages
Experience with Terraform, Ansible, or other Infrastructure as Code tools
Experience supporting CI/CD pipelines and production deployments
Strong communication skills and a reliability-first mindset

Nice to Have:

Kubernetes and container orchestration experience
Experience with observability tools such as Prometheus, Grafana, Datadog, Splunk, or ELK
Experience with high-traffic, highly available systems
Knowledge of chaos engineering, error budgets, or AIOps
Cloud or Kubernetes certifications

Why Join Us:

Work on scalable, mission-critical platforms
Influence reliability and engineering best practices
Collaborative, blameless culture
Competitive compensation, benefits, and growth opportunities
