Position Description:
This role is for a highly skilled Production Support Engineer with Site Reliability Engineering (SRE) expertise, responsible for ensuring the availability, stability, performance, and resiliency of mission‑critical mainframe-based payment systems. The engineer will provide deep production support, proactive monitoring, incident management, and reliability improvements for high-volume, low-latency payment platforms.
The ideal candidate brings strong mainframe production support experience, solid payments and banking domain knowledge, and a mindset aligned with SRE principles, including automation, monitoring, continuous improvement, and operational excellence.
This role is hybrid and requires you to be in the office a minimum of 4 days per week, subject to change at any time.
Your future duties and responsibilities:
• Production Support & Reliability
Own 24x7 production support for mainframe-based payment applications, ensuring high availability and minimal downtime.
Monitor system health, performance, and capacity using enterprise monitoring tools; proactively identify and mitigate risks.
Act as an L2/L3 support engineer for critical incidents, performing root cause analysis (RCA) and driving permanent fixes.
Lead and support incident response, major incident management, and post‑incident reviews, ensuring lessons are documented and applied.
• SRE & Operational Excellence
Apply SRE principles to define and monitor SLIs, SLOs, and error budgets for payment systems.
Improve system reliability by reducing manual interventions through automation, scripting, and operational best practices.
Continuously optimize system performance, batch throughput, and transaction processing times on mainframe platforms.
Partner with development teams to shift‑left reliability concerns and improve operability before production releases.
• Mainframe Operations & Payments Processing
Support and maintain mainframe environments (z/OS) handling high-volume payment transactions.
Monitor and support batch and online workloads, including CICS, IMS, DB2, MQ, and JCL-based processing.
Ensure end‑to‑end processing integrity for payment flows, including settlement, reconciliation, and exception handling.
Coordinate with infrastructure, network, and middleware teams during outages or system changes.
• Change, Release & Compliance
Support production releases, system upgrades, and configuration changes with minimal risk to live payment processing.
Ensure compliance with audit, security, and regulatory requirements inherent to banking and payments platforms.
Maintain accurate runbooks, operational procedures, and knowledge base documentation.
Required qualifications to be successful in this role:
• Mainframe & Production Support
Strong hands-on experience in Mainframe Production Support (z/OS environment).
Working knowledge of CICS, DB2, IMS, MQ, and batch processing (JCL).
Experience with performance tuning, abend analysis, dump analysis, and job optimization.
• Monitoring, Reliability & Automation
Experience with enterprise monitoring and alerting tools for mainframe systems (e.g., OMEGAMON or equivalent).
Strong understanding of SRE concepts such as SLIs, SLOs, MTTR, error budgets, and reliability metrics.
Experience in scripting or automation (REXX, shell, Python, or equivalent) to reduce manual operational effort.
• Data & Analysis
Proficiency in SQL for querying DB2 and analyzing production data issues.
Ability to analyze system metrics, logs, and transaction traces to identify bottlenecks and failures.
• Cloud & Modern Tooling (Good to Have)
Basic understanding of cloud concepts (Azure preferred) and hybrid architectures connecting mainframe with distributed systems.
Exposure to Generative AI / Copilot tools for operational efficiency, incident analysis, or knowledge management is a plus.
• Required Domain Knowledge
Strong understanding of payments processing and payment rails such as SWIFT, ACH, Fedwire, RTGS, or equivalent.
Solid banking domain knowledge, including transaction lifecycle, settlements, reconciliations, and regulatory constraints.
Experience supporting mission-critical,…