
Secure Reliability Engineering Manager

Job in Reston, Fairfax County, Virginia, 22090, USA
Listing for: SAP Belgium NV/SA
Full Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    Cybersecurity, Cloud Computing, Systems Engineer, IT Support
Salary/Wage Range or Industry Benchmark: 220,200 - 374,200 USD yearly
Job Description & How to Apply Below

We help the world run better

At SAP, we keep it simple: you bring your best to us, and we'll bring out the best in you. We're builders touching over 20 industries and 80% of global commerce, and we need your unique talents to help shape what's next. The work is challenging – but it matters. You'll find a place where you can be yourself, prioritize your wellbeing, and truly belong.

What's in it for you? Constant learning, skill growth, great benefits, and a team that wants you to grow and succeed.

“Due to the potentially classified nature of our work, your willingness is required to subject yourself to a governmental security clearance process.”

Overview

We are seeking an experienced Secure Reliability Engineering (SRE) Manager to lead the reliability, resilience, and secure operation of a sovereign cloud platform supporting regulated and high-trust workloads. This role is responsible for ensuring that availability, performance, and security are engineered into the platform by design, using Terraform-driven Infrastructure as Code (IaC), cloud-native services, and open-source technologies.

The ideal candidate brings deep technical credibility in cloud reliability engineering, strong people leadership, and a security-first mindset—treating security, compliance, and sovereignty as core reliability requirements, not afterthoughts.

Key Responsibilities

Platform Reliability & Architecture
  • Own the reliability, availability, and resilience of sovereign cloud platforms supporting regulated workloads across hyperscalers (AWS, Azure, GCP, and sovereign variants)

  • Design and enforce secure information and failure boundaries, including:

    • Network segmentation and fault isolation

    • Identity, access, and privilege separation

    • Data residency, encryption, and key management controls

  • Define and manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets aligned with sovereign and regulatory requirements

  • Partner with Security, Architecture, and Compliance teams to ensure reliability designs meet sovereignty, regulatory, and contractual obligations

Infrastructure as Code & Reliability Automation
  • Lead development and governance of Terraform-based IaC frameworks with reliability and security baked in

  • Establish reusable modules, standards, and pipelines for:

    • Cloud-native services (compute, storage, networking, identity)

    • Built-in resilience patterns (multi-zone, multi-region, failover)

    • Embedded security and compliance controls

  • Drive automation for:

    • Provisioning and configuration

    • Drift detection and remediation

    • Capacity management and lifecycle operations

Secure SRE Operations
  • Build and operate reliability-focused CI/CD pipelines for infrastructure and platform services

  • Lead operational practices including:

    • Monitoring, logging, tracing, and alerting

    • Incident response, root cause analysis, and post-incident reviews

    • Change, release, and reliability risk management

  • Reduce toil through automation while maintaining strict security and change controls

Security, Compliance & Operational Assurance
  • Implement security-by-default and resilience-by-design practices across all environments

  • Ensure operational alignment with frameworks such as:

    • Zero Trust architecture

    • NIST, ISO, SOC, or equivalent regulatory standards

  • Support audits and assessments by delivering traceable, code-driven controls, operational evidence, and reliability metrics

  • Treat compliance gaps, security weaknesses, and reliability risks as production-impacting issues

Cloud-Native & Open-Source Technologies
  • Govern and operate cloud-native and open-source platforms such as:

    • Kubernetes, Helm, Argo, Vault, Open Policy Agent

  • Ensure platforms are secure, observable, resilient, and supportable

  • Evaluate emerging technologies that improve reliability, security posture, and operational efficiency

People Leadership & Reliability Culture
  • Lead, mentor, and grow a team of Secure Reliability Engineers

  • Establish an SRE culture focused on:

    • Blameless incident response

    • Continuous improvement

    • Strong operational ownership

  • Define clear roadmaps, reliability goals, and success metrics aligned with business and sovereign requirements

Required Qualifications
  • 10 years of experience in SRE, DevOps, Cloud…
