Manager, Software Development & Engineering
Job in
Southlake, Tarrant County, Texas, 76092, USA
Listed on 2026-06-08
Listing for:
Charles Schwab
Full-time position
Job specializations:
- IT/Tech
IT Support, Systems Engineer, Cybersecurity, Cloud Computing
Job Description & How to Apply Below
At Schwab, you're empowered to make an impact on your career. Here, innovative thought meets creative problem solving, helping us "challenge the status quo" and transform the finance industry together.
Schwab Technology Services enables the future of how clients manage their money by providing innovative and reliable technology products and services as part of our ongoing commitment to democratize access to investing and financial planning.
This is a senior technical role focused on Site Reliability Engineering for critical enterprise applications and platforms. The role combines hands-on production support, observability, incident prevention, release reliability, automation, operational resilience, and support for compliance and regulatory expectations in enterprise environments.
The position supports high-impact incident response, improves operational standards, mentors onshore and offshore engineers, and communicates clearly with both technical and business stakeholders. It is a strong fit for someone who wants to improve reliability, reduce operational risk, and scale support through automation and better engineering practices.
Key Responsibilities
- Lead production support, operational readiness, and reliability risk management for critical services and dependencies. Manage major incident triage, escalation, recovery, stakeholder communications, and closure activities, including coordination through Remedy or similar enterprise ticketing and incident management tools, with execution aligned to SLAs.
- Work closely with Development and Business Product Owner teams to align reliability priorities, release readiness, and incident communication; identify SLIs, determine SLOs, and plan remediations aligned to business outcomes.
- Improve observability through dashboards, alerting, event correlation, and trend-based early warning. Support release reliability through deployment validation, rollback preparedness, readiness checks, and post-release verification.
- Build and maintain automation using Python, Bash, Windows Batch scripting, and PowerShell to standardize support processes, improve recovery actions, create reusable solutions, and reduce toil.
- Develop automation for monitoring, deployment validation, routine operational tasks, recovery procedures, incident response workflows, and process efficiency improvements.
- Support disaster recovery planning, zonal isolation planning and execution, recovery testing, certificate-related operational needs, and secure production readiness.
- Support compliance and regulatory requirements through disciplined operational controls, documentation, and reliable execution.
- Use GitHub and other software configuration management tools for source control, collaboration, workflow support, and governance.
- Apply security knowledge and access grouping concepts to support secure operations, platform access controls, and operational readiness.
- Mentor engineers on troubleshooting, automation, SRE and observability disciplines, and cross-time-zone handoffs, and contribute to architecture reviews to improve operability, resilience, and maintainability.
What you have
Required Qualifications
- Strong experience in Site Reliability Engineering, observability, production support, and enterprise platform operations.
- Proven experience managing major incidents, root cause analysis, service account or password restoration, and operational risk reduction in complex production environments with strong SLA-driven execution.
- Strong hands-on experience with Splunk or similar monitoring and observability platforms.
- Strong troubleshooting skills across applications, infrastructure, platforms, networking, databases, storage, and integrated service dependencies.
- Strong scripting and automation skills using Python, Shell/Bash, Windows Batch, and PowerShell to improve operational support, monitoring, deployment validation, recovery procedures, and repetitive task reduction.
- One year of Schwab technology domain experience gained as a current or recent contractor or employee.
- Experience building reusable automation solutions that improve consistency, reduce manual effort, and reduce toil.
- Experience with GitHub and other software configuration management tools. Experience in build and release management, CI/CD practices, deployment controls, and release reliability processes.
- Experience supporting applications on PCF and operating in distributed production environments.
- Working knowledge of resiliency and recovery, including HA patterns, zonal isolation, failover/failback, RTO/RPO, recovery testing, and post-recovery validation.
- Ability to provide operational support, including reading and writing SQL queries for troubleshooting and data validation.
- Familiarity with Jira and Scrum concepts, along with experience using Remedy or similar enterprise incident and ticket management platforms.
- Understanding of security concepts and…