Director, Architect Enterprise Resilience & Recoverability
Job in Bethesda, Montgomery County, Maryland, 20813, USA
Listed on 2026-06-23
Listing for: Marriott International
Full Time position
Job specializations:
- IT/Tech
Systems Engineer, Disaster Recovery IT, Cloud Computing: Infrastructure & Operations
Job Description & How to Apply Below
We are seeking a Director, Architect of Enterprise Resiliency & Recoverability to serve as the principal technical leader for how Marriott engineers, validates, and matures resiliency and disaster recovery across its global technology landscape. Reporting to the Senior Director of Enterprise Observability and Technology Resiliency & Recoverability, this role is the senior technical authority for both preventative resiliency and operational recoverability, ensuring that the systems our guests and properties depend on are resilient by design and recoverable by proof.
The Director owns the architectural and engineering discipline that keeps Marriott's most critical platforms resilient and recoverable. The role spans the full spectrum of modern resiliency practice - repeatable failover with verified transaction success, component-level recovery, automated DR validation, multi-region and active-active patterns, chaos engineering, and self-healing service design. The Director partners deeply with Enterprise Architecture, SRE, Infrastructure, Cloud, Network, Security, and Application Engineering teams to embed resiliency into how Marriott designs, deploys, and operates technology.
This is a hands-on engineering leadership role for a technical architect who can set direction, drive cross-domain remediation, and stand up as the technical authority during recovery exercises and live recovery events - not a people manager of engineers.
The right candidate is fluent in cloud-native resiliency patterns, multi-region architectures, chaos engineering, and modern recovery automation, and is equally comfortable in an architecture review, an executive readout, and a live recovery event.
This role is ideal for someone who:
* Translates deep technical knowledge of resiliency and recovery into architectural standards and business-aligned decisions
* Navigates ambiguity, matrixed organizations, and limited resources with clarity and conviction
* Leads through influence - setting standards, coaching engineers, and guiding remediation across teams without direct authority
* Balances strategic oversight with sleeves-rolled-up engineering, including direct contribution to recovery design, automation, and validation
* Thinks in systems: connects business transactions to SLOs, SLOs to architecture, and architecture to recovery outcomes
* Is energized by building engineered, continuously validated resilience at enterprise scale
CANDIDATE PROFILE
Required Experience and Education:
* Bachelor's degree in Computer Science, Engineering, Information Systems, or a related discipline - or equivalent professional experience and certifications
* 8+ years of progressive experience in systems, infrastructure, cloud, or platform engineering within a large enterprise environment, including:
  * 5+ years specifically in resiliency engineering, disaster recovery, or reliability engineering at scale
* Demonstrated experience as a senior technical authority - architect, principal engineer, or technical director - for enterprise resiliency and/or disaster recovery programs and for live recovery events
* Proven experience designing and validating end-to-end DR and high-availability architectures for enterprise-scale workloads across cloud (AWS, Azure, GCP, or Alibaba), hybrid, and on-premises environments
* Experience aligning technical recovery designs to business recovery objectives (RTO, RPO, business criticality) and translating between business impact and technical implementation
* Deep working knowledge of cloud-native resiliency patterns: multi-AZ and multi-region designs, redundancy and fault tolerance, automated failover, dynamic traffic management, and adaptive connectivity
* Strong recoverability foundation: backup and restore integrity, immutable and versioned backup, ransomware recovery frameworks, isolated recovery environments, and cross-region recovery patterns
* Familiarity with infrastructure-as-code and automation tooling (e.g., Terraform, Ansible, CloudFormation) applied to DR orchestration, validation, and drift detection
* Experience with containerized and distributed systems, including Kubernetes, service mesh, and…