Site Reliability Engineer
Job in Dallas, Dallas County, Texas, 75215, USA
Listed on 2025-12-18
Listing for: Omega Hires
Full-time position
Job specializations:
- IT/Tech: Cloud Computing, Systems Engineer, SRE/Site Reliability
Job Description
Site Reliability Engineer (SRE)
Location: Dallas, TX
- Azure DevOps (YAML, ARM)
- Azure Kubernetes Service
- Kubernetes (open source)
- Docker
- Partner with the architecture and development teams on how to make applications highly available, reliable, and performant at global scale.
- Collaborate with the architecture team to ensure reliability factors are accounted for in business features and enablers.
- Guide development teams in understanding established service level objectives (SLOs) and their consequences, and in implementing appropriate SLIs to support those objectives.
- Collaborate with development team members to swarm, troubleshoot, and resolve problems.
- Guide ad‑hoc teams to brainstorm solutions and build implementation plans based on root cause analysis of production issues.
- Design and build automated solutions to optimize application/service/platform uptime with minimal human intervention.
- Be available for an on‑call rotation to participate in troubleshooting and communication efforts outside of normal business hours.
- Implement and help create standards and best practices, and mentor other team members to drive adoption across development teams.
- Perform other duties as assigned.
- Conform with all company policies and procedures.
- Expert in defining, implementing, and evaluating Service Level Objectives (SLO) and Service Level Indicators (SLI), and associated consequences.
- Software development expertise in two or more high‑level programming and scripting languages.
- Experience in evolutionary database design, query performance analysis, and indexing as a cornerstone for delivering scalable, performant products and services.
- Experience in designing, building, and optimizing automated pipelines with automated testing and automated security controls.
- Experience in performing root cause analysis and problem management.
- Experience working in Agile Scrum teams with demonstrated success leading improvements (getting better/faster/happier).
- Help establish and maintain a culture of learning through the development and sharing of skills, knowledge, process and tools; combat traditional silos that create “us and them” environments.
- A driving passion for finding solutions to hard problems at scale and operationalizing them.
- Exceptional critical thinking and communication skills, with a passion for leveraging documentation as a tool for constant improvement.
Skills and Abilities
- Pipeline Automation: Azure DevOps (YAML, ARM), Terraform, Jenkins, Chef, Octopus Deploy.
- Code Scanning: SonarQube, Checkmarx.
- Source Code repos: Git.
- Containerization: Azure Kubernetes Service, Kubernetes (open source), Docker.
- High‑level programming languages: Java, C# (.NET MVC and .NET Core), Go.
- Scripting: PowerShell, Bash.
- Database: Oracle, Microsoft SQL Server, NoSQL (e.g., Cosmos DB).
- Test Automation: Xamarin.UITest, SpecFlow, DevTest, Selenium, Test Data Manager, Postman, Maven, TestNG, JMeter.
- Operating systems: Windows, Linux.
- Cloud platforms: Azure.
- Metrics and Monitoring: Splunk.