
Site Reliability/Platform Engineer; Linux/Kubernetes/Python

Job in Reston, Fairfax County, Virginia, 22090, USA
Listing for: Career Developers
Full Time position
Listed on 2026-06-02
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability, Cloud Computing, Network Engineer
Salary/Wage Range or Industry Benchmark: 180000 - 190000 USD Yearly
Job Description & How to Apply Below
Position: Site Reliability/Platform Engineer (Linux/Kubernetes/Python) - 180-190K

Site Reliability Engineer (Kubernetes / OpenShift Platform Engineering)

Location: Reston, VA
Salary: 180-190K + 10% Bonus

Must have the following: on-prem Kubernetes engineering, OpenShift, platform engineering, observability tools, incident response, automation, production troubleshooting, Linux environments

Responsibilities:

  • Maintain the health, stability, and reliability of core technical platforms and platform services supporting business continuity and high availability.
  • Improve end-to-end platform observability to ensure system performance, incidents, and trends are proactively identified and addressed.
  • Lead incident response efforts, root-cause analysis, and postmortems to continuously improve platform reliability and reduce recurring issues.
  • Partner with development teams to troubleshoot deployment, routing, ingress, and configuration issues within Kubernetes/OpenShift environments.
  • Build and maintain automated deployment pipelines supporting engineering, development, and data teams.
  • Develop, test, and deploy automation solutions that reduce manual intervention and improve operational efficiency.
  • Lead the rollout of new platform services, features, and capabilities across hybrid infrastructure environments.
  • Operate and support platform services across on-premises infrastructure and Azure cloud services.
  • Maintain operational documentation, deployment procedures, incident response plans, and technical runbooks.
  • Participate in on-call rotation supporting production environments and critical infrastructure systems.
  • Assist with additional technical initiatives and operational responsibilities as needed.

Requirements:

  • Bachelor's degree in Computer Science or related field, or equivalent practical experience.
  • 4–5+ years of experience in Kubernetes Engineering, Site Reliability Engineering, Platform Engineering, or similar infrastructure-focused roles.
  • Strong hands-on Kubernetes engineering experience, including workload management, operators, routing/ingress, cluster administration, and performance management.
  • Experience managing and supporting OpenShift environments is highly preferred.
  • Experience deploying and supporting platform services and observability tooling.
  • Strong troubleshooting skills across logs, metrics, traces, packet captures, and Kubernetes debugging tools.
  • Strong understanding of observability platforms and connecting alerts, incidents, and operational trends to actionable outcomes.
  • Experience working within regulated or heavily audited environments preferred.
  • Strong communication skills with the ability to document technical procedures and operational activities thoroughly.
  • Ability to manage multiple priorities in a dynamic, fast-paced environment.
  • Strong collaboration skills with the ability to work effectively across engineering and infrastructure teams.
  • Experience conducting independent technical research and presenting findings to leadership and peers.
  • Proof of eligibility to work in the United States required.

Site Reliability Engineer, Kubernetes Engineer, OpenShift Engineer, Platform Engineer, DevOps Engineer, Kubernetes administration, OpenShift platform, cluster management, routing ingress, observability tools, Prometheus, Grafana, Datadog, incident response, production support, infrastructure engineer, automation engineer, CI/CD pipelines, platform reliability, troubleshooting Kubernetes, container orchestration, cloud infrastructure, Azure cloud, Linux systems, platform services, SRE jobs, enterprise infrastructure, root cause analysis, deployment automation, platform operations, production troubleshooting, hybrid infrastructure, site reliability, platform monitoring

Site Reliability Engineer, SRE, OpenShift engineer, Kubernetes engineer, Azure cloud engineer, platform engineer, DevOps engineer, observability, Grafana, Prometheus, Datadog, HashiCorp Vault, Kafka, AMQ, Redis, CI/CD, automation, Bash scripting, Python scripting, cloud infrastructure, hybrid cloud, data center, reliability engineering, incident response, root cause analysis, container platform, cluster management, Azure infrastructure, production support, platform reliability, DevOps, monitoring tools, automation engineer, enterprise infrastructure, platform services, site reliability, cloud platform, OpenShift administrator, Kubernetes troubleshooting
