
Systems Reliability Engineer

Job in Burke, Fairfax County, Virginia, 22015, USA
Listing for: Leidos
Full Time position
Listed on 2026-02-18
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, IT Support, Cybersecurity
Salary/Wage Range or Industry Benchmark: 100000 - 125000 USD Yearly
Job Description & How to Apply Below

Systems Reliability Engineer at Leidos summary:

The Systems Reliability Engineer at Leidos is responsible for ensuring the availability and performance of GEOAxIS ICAM services by troubleshooting incidents, performing root cause analysis, and implementing automated monitoring and remediation solutions. The role requires expertise in COTS product integration, scripting in Linux environments, and familiarity with DevOps tools and container technologies. Candidates must hold a TS/SCI clearance, have strong communication skills, and be able to work onsite in Chantilly, VA with the ability to support off‑hour calls for high‑priority incidents.

Description

GEOAxIS is looking for a Systems Reliability Engineer to work with the rest of the operations team to help drive program technical execution, innovation and modernization.

The GEOAxIS system provides Identity, Credential and Access Management (ICAM) for all web applications. GEOAxIS enables online, on‑demand access to NGA GEOINT content based on users’ authoritative attributes/roles. Our mission is to maintain highly available ICAM services that protect those critical mission applications across all security domains. The GxNext contract was awarded to Leidos in 2021 and runs until 2031.

Responsibilities
  • Troubleshoot and resolve system/operational incidents

  • Perform root cause analysis for operational incidents

  • Analyze system performance and take corrective actions as needed

  • Coordinate with mission partners, consumer applications, and other external entities in troubleshooting enterprise incidents and integration problems

  • Design, develop, and implement automated solutions to proactively monitor system health, identify performance bottlenecks, and resolve system issues through automated remediation, reducing manual intervention and improving system reliability.

  • Collect data, identify and analyze trends in Operational Incidents, and provide suggestions to mitigate common issues

  • Work closely with Ops Tech Lead and Development Lead to identify baseline enhancements to improve operational stability

  • Work with deployment and ISP teams to support baseline deployments to operations

  • Willingness to support off‑hour calls to assist in troubleshooting when high priority operational incidents occur

Requirements/Qualifications
  • BS degree and 4+ years of prior relevant experience, or Master's degree with 2+ years of prior relevant experience

  • Requires a TS/SCI clearance and the ability to obtain and maintain a polygraph post-hire

  • Strong communication skills, both verbal and written

  • Ability to quickly learn new software and IT concepts

  • Strong problem solving and decision making skills

  • Self‑starter with an ability to work in a team environment and independently

  • Intimately familiar with the COTS products that the program leverages:
    Oracle Identity and Access Management (IdAM) suite, Apache webgates, and Computer Associates (CA) API Gateway

  • Experience scripting in a Linux environment using Shell and Bash

  • Deep understanding and background in COTS integration and custom code development

  • Experience in at least one of the following languages:
    • Bash
    • Python
    • Java
    • NodeJS

  • Local to DMV (DC/Maryland/Virginia) with ability to be physically present at the team’s work location in Chantilly

  • Strong interpersonal skills and proven track record of leading technical teams, conveying technical solutions to technical and non‑technical audiences

  • Candidate must be physically present in Chantilly, VA a minimum of 5 days a week to work with the team, with occasional meetings in Reston and/or Springfield, VA

  • All candidates must be US CITIZENS to be considered for the position

  • Security+ certification within 60 days of hire

Preferred
  • Kubernetes experience using Rancher RKE2 or OpenShift

  • Strong understanding of containers

  • Experience containerizing existing custom software

  • Knowledge of common DevOps tools such as:
    • Ansible
    • ArgoCD
    • GitLab
    • Nexus3
    • Kubernetes

  • Certifications in any of the following:
    • RHCSA/RHCE
    • AWS Solutions Architect/DevOps Engineer
    • CKA/CKAD

  • Familiarity with modern authentication flows such as SAML, OAuth2 and OIDC

At Leidos, we don’t want someone who "fits the mold"—we want someone who melts it down and builds something better. This is a role for the…
