
AVP, Reliability Engineer - OnePay

Job in Alpharetta, Fulton County, Georgia, 30239, USA
Listing for: Synchrony
Full Time position
Listed on 2026-06-06
Job specializations:
  • IT/Tech
    IT Support, Systems Engineer, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 USD yearly
Job Description & How to Apply Below

Role Summary/Purpose:

The AVP, Reliability Engineer - OnePay plays a pivotal technical role within Synchrony Financial, ensuring the high availability, stability, security, and performance of applications supporting OnePay integrations. To deliver operational excellence in a highly regulated environment, this role applies technical expertise and rigor to identify and remediate failures or emerging issues that could degrade customer and partner experiences or jeopardize SLA adherence.

The ideal candidate excels at problem analysis, troubleshooting methods, and situational awareness within the context of distributed systems.

This is a hands‑on technologist role requiring exposure to SRE and DevOps technology stacks and a strong understanding of application support processes, including monitoring and addressing incidents and alerts across engineering applications and ensuring effective coordination and handoffs with vendors, partners, and internal Synchrony teams. The role also develops automation and leverages AIOps approaches to detect gaps, monitor trends, reduce operational toil, and expedite response and remediation.

Essential Responsibilities:
  • Drive investigations with cross‑functional teams to understand failures, analyze production defects, troubleshoot systems, identify root cause, and implement fixes to prevent recurrence.
  • Ensure the dependability, availability, and scalability of OnePay‑integrated applications and services by partnering with application, platform, and infrastructure teams.
  • Enhance observability, including establishing and maintaining dashboards and monitoring capabilities (e.g., Splunk, New Relic, and similar tools), improving alert quality, and strengthening operational readiness.
  • Design and implement monitoring, alerting, and metrics to track and report adherence to service SLAs/SLOs, performance, and operational efficiency.
  • Develop automation and leverage AIOps to detect reliability gaps, monitor trends, reduce noise, and expedite incident response and restoration activities.
  • Continuously monitor the health and performance of engineering applications, production servers, and key service indicators; provide monitoring/reporting and recommendations as needed.
  • Support release and operational processes, including troubleshooting CI/CD pipeline issues (e.g., Jenkins pipelines) and coordinating releases with partner teams.
  • Participate in Agile sprints with cross‑functional teams involving multiple technologies, personnel, and processes; contribute reliability requirements and improvements that support continuous delivery.
  • Support a root cause analysis discipline and continuous improvement practices that reduce downtime and increase resiliency.
  • Coordinate effectively with vendor partner teams and Synchrony teams to ensure seamless support handoffs and timely issue resolution.
  • Communicate the status of technical stacks, incidents, risks, and reliability initiatives to stakeholders and leadership, including partner‑facing stakeholders as appropriate.
  • Work closely with an experienced staff comprising both Synchrony resources and third‑party contractors.
  • Participate in an on‑call rotation to respond to critical production issues.
  • Perform other duties and/or special projects as assigned.

Qualifications/Requirements:
  • Bachelor’s degree and a minimum of 5 years of relevant experience in application development, reliability engineering, systems engineering, and/or production application support (or equivalent practical experience); OR, in lieu of a degree, a High School Diploma/GED and a minimum of 8 years of relevant experience.
  • Demonstrated experience troubleshooting and supporting distributed systems in cloud environments.
  • Good understanding of the nature of distributed systems and cloud providers.
  • Solid understanding of cloud concepts such as containerization, message queues, load balancing, data replication, and high availability patterns.
  • Understanding of IT application support processes, including incident management, problem resolution, and operational/support metrics used for decision‑making.
  • Knowledgeable in UNIX Operating System fundamentals.
  • Familiar with network…