Secure Reliability Engineering Manager

Job in Reston, Fairfax County, Virginia, 22090, USA
Listing for: SAP SE
Full Time position
Listed on 2026-02-08
Job specializations:
  • IT/Tech
    Cybersecurity, Cloud Computing, Systems Engineer, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: USD 220,200 - 374,200 per year
Job Description & How to Apply Below

Overview

We are seeking an experienced Secure Reliability Engineering (SRE) Manager to lead the reliability, resilience, and secure operation of a sovereign cloud platform supporting regulated and high-trust workloads. This role is responsible for ensuring that availability, performance, and security are engineered into the platform by design, using Terraform-driven Infrastructure as Code (IaC), cloud-native services, and open-source technologies.

The ideal candidate brings deep technical credibility in cloud reliability engineering, strong people leadership, and a security-first mindset—treating security, compliance, and sovereignty as core reliability requirements, not afterthoughts.

Key Responsibilities

Platform Reliability & Architecture

  • Own the reliability, availability, and resilience of sovereign cloud platforms supporting regulated workloads across hyperscalers (AWS, Azure, GCP, and sovereign variants)
  • Design and enforce secure information and failure boundaries, including:
    • Network segmentation and fault isolation
    • Identity, access, and privilege separation
    • Data residency, encryption, and key management controls
  • Define and manage Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets aligned with sovereign and regulatory requirements
  • Partner with Security, Architecture, and Compliance teams to ensure reliability designs meet sovereignty, regulatory, and contractual obligations

Infrastructure as Code & Reliability Automation

  • Lead development and governance of Terraform-based IaC frameworks with reliability and security baked in
  • Establish reusable modules, standards, and pipelines for:
    • Cloud-native services (compute, storage, networking, identity)
    • Built-in resilience patterns (multi-zone, multi-region, failover)
    • Embedded security and compliance controls
    • Provisioning and configuration
    • Drift detection and remediation
    • Capacity management and lifecycle operations

Secure SRE Operations

  • Build and operate reliability-focused CI/CD pipelines for infrastructure and platform services
  • Lead operational practices including:
    • Monitoring, logging, tracing, and alerting
    • Incident response, root cause analysis, and post-incident reviews
    • Change, release, and reliability risk management
  • Reduce toil through automation while maintaining strict security and change controls

Security, Compliance & Operational Assurance

  • Implement security-by-default and resilience-by-design practices across all environments
  • Ensure operational alignment with frameworks such as:
    • Zero Trust architecture
    • NIST, ISO, SOC, or equivalent regulatory standards
  • Support audits and assessments by delivering traceable, code-driven controls, operational evidence, and reliability metrics
  • Treat compliance gaps, security weaknesses, and reliability risks as production-impacting issues

Cloud-Native & Open-Source Technologies

  • Govern and operate cloud-native and open-source platforms
  • Ensure platforms are secure, observable, resilient, and supportable
  • Evaluate emerging technologies that improve reliability, security posture, and operational efficiency

People Leadership & Reliability Culture

  • Lead, mentor, and grow a team of Secure Reliability Engineers
  • Establish an SRE culture focused on:
    • Blameless incident response
    • Strong operational ownership
  • Define clear roadmaps, reliability goals, and success metrics aligned with business and sovereign requirements

Required Qualifications

  • 10+ years of experience in SRE, DevOps, Cloud Engineering, or Platform Engineering
  • 4+ years of experience leading or managing technical teams
  • Deep hands-on experience with Terraform in regulated production environments
  • Strong experience with at least one major cloud provider (AWS, Azure, GCP)
  • Proven experience designing highly available, secure, and isolated cloud platforms
  • Strong understanding of:
    • Cloud security fundamentals (IAM, encryption, network security, secrets management)
    • Reliability engineering concepts (SLOs, error budgets, incident management)
  • Experience with CI/CD, observability, and automation tooling

Preferred Qualifications

  • Experience supporting sovereign, government, or highly regulated environments
  • Kubernetes platform reliability experience in security-sensitive contexts
  • Fam…