
Senior Associate – DR Recovery Lead (IT Operations)

Job in New York, New York County, New York, 10261, USA
Listing for: New York Life Insurance Co
Full Time position
Listed on 2025-12-27
Job specializations:
  • IT/Tech
    Systems Engineer, Cybersecurity, Cloud Computing, IT Project Manager
Salary/Wage Range or Industry Benchmark: 60,000 - 80,000 USD yearly
Job Description & How to Apply Below
Position: Senior Associate – DR Recovery Lead (IT Operations)
Location: New York

Location Designation:
Hybrid – 3 days per quarter

As part of Technology, you’ll have the opportunity to contribute to groundbreaking initiatives that shape New York Life’s digital landscape. Leverage cutting‑edge technologies like Generative AI to increase productivity, streamline processes, and create seamless experiences for clients, agents, and employees. Your expertise fuels innovation, agility, and growth – driving the company’s success.

Role Summary

New York Life is standing up a repeatable, automation‑first Disaster Recovery (DR) operating model to ensure we can sustain a Minimum Viable Company (MVC) and recover priority services within 48 hours. As the DR Recovery Lead (IT Ops), you will be the single‑threaded owner for day‑to‑day DR operations: driving orchestration execution, maintaining infra/app runbooks, coordinating cross‑tech teams and vendors, and ensuring audit‑ready evidence for quarterly exercises and an annual recovery test calendar.

You’ll also align DR with enterprise architecture and regulatory standards and continuously improve our capabilities.

What You’ll Do
  • Own DR operations & runbooks: Build, maintain, and continuously improve infrastructure and application recovery runbooks aligned to the enterprise DR framework and RACI.
  • Execute orchestrated recoveries: Lead automation‑first recovery using IaC/pipelines and an evidence harness to capture artifacts, health checks, and outcomes for audit.
  • Plan & run tests: Lead quarterly tabletop/functional validations, drive an annual DR exercise calendar, and manage test evidence and acceptance with business owners.
  • Safeguard environments: Monitor configuration parity and drift; ensure DR capacity/readiness across failover patterns; coordinate change windows with APSO/CAB.
  • Restore securely: Coordinate restoration of IAM, keys/certs, and control re‑enablement in alignment with cyber‑incident procedures.
  • Recover data with integrity: Partner with DBA/Data teams on backup/restore or replication, validation, and reconciliation steps.
  • Prove service health: Define and run synthetic probes/SLIs/SLOs and publish dashboards to verify recoverability.
  • Manage vendors: Orchestrate third‑party SLAs, negotiate test windows, and validate contractual obligations and evidence.
  • Map & prioritize services: Maintain Critical Business Service (CBS) inventories and dependencies; scale playbooks across priority CBS.
  • Lead during incidents: Serve as DR operations lead for activation, coordinating comms and cross‑tech execution through recovery.
Added Focus Areas
  • Architectural alignment: Ensure DR strategies, patterns, and runbooks conform to enterprise architecture standards, reference architectures, and future‑state infrastructure plans; participate in design reviews and provide DR non‑functional requirements.
  • Multi‑cloud & cloud‑native DR: Engineer and operate DR solutions across on‑prem and multi‑cloud environments (e.g., AWS/Azure), leveraging cloud‑native patterns such as active/active, regional failover, immutable infrastructure, and serverless recovery.
  • Regulatory & compliance: Embed controls and evidence to meet NYDFS, SOX, GDPR, and related obligations; align to NIST (e.g., SP 800‑34/61) and ISO 22301 principles; maintain audit‑ready artifacts and traceability.
  • Continuous improvement & innovation: Drive quarterly improvement backlogs; pilot emerging techniques (e.g., chaos engineering/game days, AI‑assisted recovery validation), retire manual steps, and report ROI.
Qualifications
  • 8 years in IT Operations / SRE / DR or equivalent enterprise resiliency roles.
  • Hands‑on experience with DR patterns (active/active, active/passive), backup/restore & replication, and hybrid/multi‑cloud infrastructure.
  • Strong automation/IaC background (e.g., Terraform/CloudFormation), CI/CD pipelines, and scripting (PowerShell, Bash, or Python).
  • Proven test planning & execution (tabletops through functional validation) with rigorous evidence capture.
  • Familiarity with security control restoration (IAM, PKI, secrets) and alignment to cyber‑incident runbooks.
  • Observability expertise (health checks, synthetic probes, SLIs/SLOs, dashboards).
  • Effective vendor management, change/incident coordination…
Position Requirements
10+ years of work experience