VP, Reliability and Automation Engineering Manager

Job in Stamford, Fairfield County, Connecticut, 06925, USA
Listing for: Synchrony Financial
Full Time position
Listed on 2026-05-30
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability, Cloud Computing
Salary/Wage Range or Industry Benchmark: 125,000 - 150,000 USD yearly
Job Description & How to Apply Below

Role Summary / Purpose

The VP, Reliability and Automation Engineering Manager leads the Reliability Engineering function responsible for keeping all applications within the Digital Servicing Engineering domain stable, observable and operationally efficient. This role defines how reliability engineering is practiced across teams – setting standards for service health, monitoring/alerting, incident response, problem management and operational readiness – while directly managing and developing a team of reliability engineers who execute those practices day to day.

This role requires strong people leadership plus deep hands‑on expertise in reliability engineering, architecture, CI/CD, incident management and operational excellence.

Essential Responsibilities
  • Ensure effective 24x7 operational coverage and on‑call practices, including staff readiness, rotations and escalation paths.
  • Oversee production defect analysis, troubleshooting and resolution; ensure fixes are implemented and validated and that they prevent recurrence.
  • Use structured problem‑solving to assess impact, mitigation options and resolution timelines during service events.
  • Lead, coach and develop a team of Reliability Engineers; set clear goals, performance expectations and career development plans.
  • Build and maintain staffing plans, hiring strategy, onboarding approach and succession planning for reliability and automation capabilities.
  • Create a culture of operational ownership, blameless learning, high engineering standards and continuous improvement.
  • Define and drive reliability engineering strategy across Digital Servicing Engineering, including client SLAs, reliability goals, operational KPIs and service health practices.
  • Establish and govern practices for incident response, escalation, problem management and root‑cause analysis, ensuring measurable reduction in repeat incidents.
  • Partner with engineering, architecture, security and product teams to embed reliability requirements into design, delivery and release processes.
  • Drive an automation‑first roadmap to reduce manual operational work (toil), improve deployment safety and increase operational consistency.
  • Standardize and improve reliability engineering patterns, including monitoring, alerting, runbooks and operational readiness criteria.
  • Provide technical oversight and direction across the full stack, aligning reliability engineering with modern architecture practices.
  • Ensure compliance with Synchrony architecture, security and technology standards.
  • Contribute to future‑state technology strategy, modernization, migration roadmaps and production support readiness.
  • Collaborate with software engineers, product managers, architects and customer application experts to deliver resilient customer communications applications.
  • Partner with third‑party vendors as needed to integrate software into Synchrony products while maintaining reliability and operational standards.
  • Perform other duties and/or special projects as assigned.
Qualifications / Requirements
  • Bachelor’s degree and a minimum of 6 years of experience with solution architecture.
  • Minimum of 10 years of application development experience.
  • Ability and flexibility to travel for business as required.
Desired Characteristics
  • Experience defining and managing SLIs/SLOs, error budgets, operational readiness reviews and reliability KPIs.
  • Strong background in automation‑first operations.
  • Experience establishing quality standards for monitoring and alerting.
  • Deep expertise with production telemetry and tooling like Splunk and New Relic.
  • Experience implementing log/metric/trace practices and dashboards that support fast triage and service recovery.
  • Strong understanding of modern application architecture and runtime environments such as PCF, AWS, microservices, REST APIs, React JS applications and iOS/Android native applications.
  • Ability to influence design decisions to improve reliability, scalability and supportability.
  • Experience delivering automation that uses AI/ML techniques for reliability outcomes.
  • Excellent interpersonal skills and proven track record influencing across a matrixed organization.
  • Desire to work in a dynamic, fast‑paced environment.
  • Experience developing and supporting financial/banking applications.
  • Strong attention to detail in a team environment.
Salary

The salary range for this position is $ – $ USD annually and is eligible for an annual bonus based on individual and company performance. Actual compensation offered within the posted salary range will be based upon work experience, skill level or knowledge.

Equal Employment Opportunity Statement

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or veteran status.
