Site Reliability Engineer
Listed on 2026-07-02
IT/Tech
Systems Engineer, Cloud Computing: Infrastructure & Operations, SRE/Site Reliability, IT Support
Site Reliability Engineer (SRE)
GDIT is seeking a Site Reliability Engineer (SRE) to help ensure the resilience, performance, and reliability of mission-critical Defense systems. In this role, you will blend software engineering, automation, and operations expertise to build scalable platforms, reduce toil, and enable high-velocity delivery.
How You'll Make an Impact:
- Design, build, and maintain highly available, scalable systems across cloud and on-prem environments.
- Develop automation solutions that improve observability, speed recovery, and eliminate manual operational work.
- Implement monitoring, alerting, and performance tuning strategies that ensure system health.
- Collaborate with development and infrastructure teams to design reliable architectures and CI/CD pipelines.
- Conduct root cause analysis and drive systemic improvements to prevent future incidents.
- Champion SRE best practices such as SLIs/SLOs, error budgets, and automated incident response.
- Provide input to proposal operations in your area of subject matter expertise, collaborating on solution elements and writing narratives that describe the technical solution designed for a specific opportunity.
What You'll Need to Succeed:
Required:
- Work Experience: 15+ years in this space: system reliability, DevSecOps, cloud operations, or infrastructure engineering.
- Education: Bachelor's degree with 15 years of experience, or an additional 4 years of work experience in lieu of a degree.
- Strong scripting and automation skills (Python, Bash, PowerShell, etc.).
- Hands-on experience with monitoring tools (Prometheus, Grafana, Splunk, ELK, Datadog, etc.).
- Familiarity with Kubernetes, container orchestration, and modern CI/CD pipelines.
- Understanding of networking, Linux system internals, and distributed systems.
- Ability to troubleshoot complex technical issues across the stack.
- US citizenship required.
- Candidate must possess an active Secret clearance to start and be able to obtain Top Secret/SCI.
Preferred:
- Experience supporting DoD or other federal programs.
- Certifications such as Kubernetes (CKA/CKAD), AWS/Azure, or ITIL.
- Experience implementing SRE frameworks at scale.
Location & Travel:
- Location: Remote
- Travel: 25-50%
The likely salary range for this position is $164,382 - $215,050. This is not, however, a guarantee of compensation or salary. Rather, salary will be set based on experience, geographic location and possibly contractual requirements and could fall outside of this range.
Scheduled Weekly Hours: 40
Travel Required: 25-50%
Telecommuting Options: Remote
Work Location: Any Location / Remote