Lead Site Reliability Engineer
Fairfax, Fairfax County, Virginia, 22032, USA
Listed on 2025-12-27
IT/Tech
Cloud Computing, Systems Engineer, SRE/Site Reliability, IT Support
Quick Details
- Location: Fully Remote (US)
- Experience: 8+ Years
- Rate: $70-75/hour
- Duration: 6 months+
One Dynamic is a Service-Disabled Veteran-Owned Small Business (SDVOSB) headquartered in Fairfax, VA. We specialize in digital transformation, cloud infrastructure, quality assurance, and enterprise architecture for federal and healthcare organizations. We are currently seeking a Lead Site Reliability Engineer to support our client ARC, a rapidly growing device management company revolutionizing how frontline workers interact with enterprise mobile devices.
About the Role
The Lead Site Reliability Engineer is a senior technical leadership role responsible for the reliability, availability, and operational excellence of the cloud infrastructure and kiosks platform. This role owns uptime, SLAs, and incident response while driving long‑term improvements to system resilience, observability, and operational maturity. The Lead SRE serves as both a hands‑on technical leader and a force multiplier across platform, QA, and development teams.
This role is well‑suited for an experienced engineer who thrives in high‑ownership environments and can balance real‑time operational demands with strategic reliability initiatives. Strong communication, sound technical judgment, and a bias toward preventative engineering are critical to success.
Key Responsibilities
- Own uptime, SLAs, and overall reliability of the cloud infrastructure and kiosks platform
- Lead incident response, root‑cause analysis, and drive actionable postmortems
- Automate infrastructure, deployments, and operational tasks using modern IaC and scripting in collaboration with the Platform Engineering team
- Maintain and improve monitoring, alerting, and observability (e.g., Grafana, Prometheus, New Relic)
- Execute and continuously improve disaster recovery and business continuity plans
- Partner with platform engineering, QA, and development teams to ensure operational readiness
- Establish and maintain runbooks, operational standards, and reliability best practices
- Provide leadership, mentorship, and clear communication during both normal operations and incidents
- Optimize cloud and Kubernetes environments for reliability, performance, and scalability
Qualifications
- 8+ years in SRE, DevOps, or Platform Engineering roles; 2+ years in a senior or lead capacity
- Strong experience supporting production environments with strict SLAs and high uptime requirements
- Deep knowledge of Kubernetes, containers, and cloud‑native infrastructure
- Proficiency in automation and scripting using Bash, Python, or Go
- Hands‑on experience with CI/CD pipelines and release engineering in modern environments
- Expert‑level familiarity with IaC tools (Terraform preferred)
- Strong understanding of monitoring, alerting, logging, and observability tooling
- Experience implementing and managing GitOps workflows (ArgoCD or similar)
- Demonstrated ability to lead incidents and communicate effectively with technical and non‑technical stakeholders
- Solid understanding of disaster recovery planning, resilience practices, and system hardening
- Must be authorized to work in the United States (US‑based candidates only)
You think several steps ahead. You are relentless, strategic, and a long‑term thinker. You believe the details are essential, and so you get them right. You are a fast learner. You take feedback well and implement it. You care about achieving the best outcome and do not focus on being right or wrong.
About the Client
ARC is a device management solution integrated with smart lockers, designed to store, secure, and charge company‑owned handheld devices (e.g., Zebra, Honeywell) used by frontline workers to perform core job functions. Launched in late 2021, ARC was spun off from ChargeItSpot, a consumer‑facing phone‑charging technology company established in 2012.
ARC's Mission: Minimize Device Waste. Maximize Worker Productivity. Make Life Easier.
How to Apply
If you have the unique combination of skills and qualities we are seeking, please submit your resume via One Dynamic's careers portal. We look forward to hearing from you!
One Dynamic is an Equal Opportunity employer. Personnel are chosen based on ability without regard to race, color, religion, sex, national origin, disability, marital status, or sexual orientation, in accordance with federal and state law.