Sr. Site Reliability Engineer
Job in Austin, Travis County, Texas, 78716, USA
Listed on 2025-12-30
Listing for: Providence Partners, LLC
Part Time position
Job specializations:
- IT/Tech
- SRE/Site Reliability, Systems Engineer
Senior Site Reliability Engineer (Sr. SRE)
Location: Hybrid (1-2 days / week)
We are looking for a Senior Site Reliability Engineer (SRE) to help scale and operate highly available, cloud-based systems. In this role, you'll sit at the intersection of software engineering, DevOps, and platform reliability, ensuring our systems are resilient, observable, and built to perform at scale.
You'll lead incident response, drive automation, and partner closely with engineering teams to embed reliability into everything we build.
What You'll Do:
- Own the reliability, availability, and performance of production systems
- Lead incident response, on-call operations, and blameless post-mortems
- Build and improve monitoring, alerting, logging, and observability
- Define and manage SLIs, SLOs, and error budgets
- Design and build automation and self-service tools to reduce toil
- Support cloud infrastructure (AWS, Azure, GCP) using Infrastructure as Code
- Improve CI/CD pipelines and deployment reliability
- Partner with engineers on system design and architecture
- Create runbooks and operational documentation
- Mentor team members and promote SRE and DevOps best practices
Qualifications:
- 5+ years of experience in Site Reliability Engineering, DevOps, Platform, or Cloud Engineering
- Strong Linux and production troubleshooting skills
- Hands-on experience with AWS, Azure, or GCP
- Proficiency in Python, Go, Java, Bash, or similar languages
- Experience with Terraform, Ansible, or Infrastructure as Code
- Experience supporting CI/CD pipelines and production deployments
- Strong communication skills and a reliability-first mindset
Preferred Qualifications:
- Kubernetes and container orchestration experience
- Observability tools like Prometheus, Grafana, Datadog, Splunk, or ELK
- Experience with high-traffic, highly available systems
- Knowledge of chaos engineering, error budgets, or AIOps
- Cloud or Kubernetes certifications
What We Offer:
- Work on scalable, mission-critical platforms
- Influence reliability and engineering best practices
- Collaborative, blameless culture
- Competitive compensation, benefits, and growth opportunities