Site Reliability Engineer II
Job in
St. Peters, St. Charles County, Missouri, 63376, USA
Listed on 2026-06-21
Listing for:
Mission+
Full Time position
Job specializations:
- IT/Tech
Systems Engineer, SRE/Site Reliability, Cloud Computing: Infrastructure & Operations, IT Support
Job Description & How to Apply Below
Title
Site Reliability Engineer II
About the Role
The Payment Network Business Operations team is seeking a highly motivated and experienced Site Reliability Engineer II (SRE) to join our team. You will play a critical role in ensuring the reliability, scalability, and performance of our applications, supporting essential services that power Mastercard's global operations. As a thought leader in your field, you will bring technical expertise, a passion for automation, and the ability to mentor.
Responsibilities
- Serve as the production readiness steward for Mastercard products, ensuring platform stability and health.
- Support high‑availability systems and maintain operational stability.
- Apply operational design, automation, capacity planning, and monitoring to create fault‑tolerant, scalable products.
- Collaborate with developers to foster ownership and guide them through the application build phase.
- Provide rapid triage, root‑cause analysis, and blameless post‑mortems for issues.
- Engage early in the development lifecycle to be proactive and manage production and change activities.
- Drive risk management and compliance across all environments.
- Provide continuous feedback across the product lifecycle to align operational needs with product priorities.
- Assist in evaluating operational needs and developing technical solutions under guidance.
- Contribute to automation and scripting projects to streamline routine operational tasks.
- Troubleshoot and resolve basic to moderate system issues, escalating more complex problems as needed.
- Document operational procedures and share knowledge with team members.
- Participate in quality checks and reviews to ensure system stability and reliability.
- Manage smaller projects or initiatives as an experienced individual contributor.
- Observability: Implement solutions enabling collection, analysis, and visualization of metrics, logs, and traces for incident detection and continuous improvement.
- Programming and Scripting: Write and maintain code and scripts in Python, Go, Bash, or similar to automate tasks, build tools, and support monitoring, deployment, and incident response.
- Systems and Network Administration: Configure, operate, and troubleshoot Linux/Unix systems and network components, applying sound knowledge of networking concepts, protocols, security, and reliability.
- Cloud Computing and Infrastructure: Design, deploy, and manage applications and infrastructure on AWS, Azure, or GCP ensuring scalability, availability, and operational efficiency.
- Reliability and Scalability: Design and operate systems for high availability, fault tolerance, disaster recovery, and scalable performance.
- DevOps Practices: Apply CI/CD pipelines, containerization, orchestration, and other DevOps principles to enable faster, more reliable software delivery and operations.
- Troubleshooting: Systematically identify, diagnose, and resolve technical issues across systems, applications, and networks using analytic methods and tools.
- Capacity Planning and Performance Optimization: Monitor resource utilization, forecast capacity needs, and optimize system performance for growth and efficiency.
- IT Service Management: Apply ITSM principles for incident, problem, and change management to deliver reliable services and continual improvement.
- Proactive Monitoring and Improvement (SRE Applications): Use reliability signals to anticipate issues, identify risks, and drive preventative improvements.
- Strong knowledge of ITSM practices, observability, and monitoring using Splunk and Dynatrace.
- Experience operating and supporting applications on PCF and AWS platforms.
- Proven ability to implement CI/CD pipelines using Jenkins, Bitbucket, and XLR for automated build and release management.
- Abide by Mastercard’s security policies and practices.
- Ensure the confidentiality and integrity of the information being accessed.
- Report any suspected information security violation or breach.
- Complete all periodic mandatory security trainings in accordance with Mastercard’s guidelines.
Mastercard is a merit‑based, inclusive, equal opportunity employer that considers applicants…