Senior Reliability Engineer

Job in Alexandria, Fairfax County, Virginia, 22350, USA
Listing for: Leidos
Full Time position
Listed on 2026-03-14
Job specializations:
  • IT/Tech
    Systems Engineer, Cybersecurity
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD Yearly
Job Description & How to Apply Below

Description

This Department of War enterprise data and analytics program delivers mission‑critical capabilities that enable leaders across the Department to make faster, better‑informed decisions using trusted data. The Leidos Digital Modernization sector is seeking an experienced Senior Reliability Engineer to support the delivery, enhancement, and adoption of enterprise data and analytics products used across multiple DoD organizations.

In this role, you will work alongside government partners, engineers, and other industry teammates to translate operational and strategic requirements into scalable, production‑ready solutions. You will contribute directly to product planning, execution, and continuous improvement—helping ensure capabilities are delivered efficiently, aligned to mission priorities, and positioned for sustained success.

This position offers the opportunity to work on a high‑visibility, enterprise program at the intersection of data, analytics, and emerging AI technologies. Ideal candidates are motivated by mission impact, comfortable operating in complex stakeholder environments, and interested in building deep domain expertise while delivering capabilities with real‑world national security outcomes.

Primary Responsibilities
  • Develop and implement strategies leveraging FOSS, COTS, and GOTS technologies to enhance the reliability, resiliency, and scalability of the platform.
  • Conduct lab‑based SWIL and HWIL testing to validate system performance and ensure components meet scalability and operational requirements.
  • Identify performance bottlenecks, analyze usage patterns, and recommend improvements to enhance system efficiency and scalability.
  • Identify, diagnose, and address recurring incidents by performing root cause analysis and implementing preventative measures.
  • Produce and brief comprehensive resiliency and scalability assessments, providing insights into system behavior under load, failure modes, and recovery conditions.
  • Translate findings into inputs for SLAs and KPPs to support informed decision‑making by leadership.
  • Prepare, maintain, and execute a System Engineering Plan (SEP) for managing all systems architecture and system engineering related aspects of the program.
  • Conduct systems engineering activities required to specify, build, and maintain system engineering designs for the System.
  • Design, prepare, and document systems engineering and cybersecurity artifacts for the System.
  • Support the Government in recommending and conducting enterprise system architecture activities.
  • Define, document, maintain, and promulgate APIs and technical standards for using and interoperating within and outside the System.
  • Design, engineer, integrate, and continuously improve the underlying infrastructure of the System.
  • Identify, prepare, track, secure, and integrate government, commercial, and open‑source tools and services into the System.
  • Design, architect, engineer, and continuously improve the UI and UX components of the Platform.
  • Perform site reliability engineering to build and maintain a reliable, scalable, and efficient System by applying software engineering principles to operational tasks.
Basic Qualifications
  • Active Top Secret (TS) clearance with SCI eligibility.
  • Bachelor’s degree in Computer Science, Engineering, Information Systems, or related technical discipline and 8–12 years of relevant experience OR Master’s degree in a related field and 6–10 years of relevant experience.
  • Experience engineering and supporting enterprise cloud environments (AWS, Azure, or GCP).
  • Experience implementing monitoring, observability, and performance management solutions.
  • Experience conducting root cause analysis and implementing systemic reliability improvements.
  • Experience integrating reliability engineering practices into DevSecOps pipelines.
  • Experience operating within SAFe or large‑scale Agile frameworks supporting enterprise systems.
  • Experience with FOSS, COTS, and GOTS technologies.
  • Proven experience in conducting SWIL and HWIL testing.
  • Strong understanding of system performance analysis and optimization.
  • Experience in root cause analysis and implementing preventative measures.
  • Ability to produce and brief comprehensive…
Position Requirements
10+ Years work experience