Principal Application Support Engineer (SRE Lead)

Job in Coppell, Dallas County, Texas, 75019, USA
Listing for: DTCC
Full Time position
Listed on 2026-03-01
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, Cybersecurity, IT Support
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 USD yearly
Job Description & How to Apply Below
Position: Principal Application Support Engineer (SRE Lead)

Are you ready to make an impact at DTCC?
Do you want to work on innovative projects, collaborate with a dynamic and supportive team, and receive investment in your professional development? At DTCC, we are at the forefront of innovation in the financial markets. We are committed to helping our employees grow and succeed. We believe that you have the skills and drive to make a real impact. We foster a thriving internal community and are committed to creating a workplace that looks like the world that we serve.
The Information Technology group delivers secure, reliable technology solutions that enable DTCC to be the trusted infrastructure of the global capital markets. The team delivers high‑quality information through activities that include essential development, building infrastructure capabilities to meet client needs, and implementing data standards and governance.

Pay and Benefits
  • Competitive compensation, including base pay and annual incentive
  • Comprehensive health and life insurance and well‑being benefits, based on location
  • Pension / Retirement benefits
  • Paid Time Off and Personal/Family Care, and other leaves of absence when needed to support your physical, financial, and emotional well‑being.
  • DTCC offers a flexible/hybrid model of 3 days onsite and 2 days remote (onsite Tuesdays, Wednesdays and a third day unique to each team or employee).

The Impact You Will Have in This Role

The Enterprise Application Support (EAS) organization provides critical application support across the ITP and ECS lines of business, ensuring enterprise platforms are reliable, scalable, and operationally resilient.

Your Primary Responsibilities
  • Agile & Delivery Engagement
    • Participate in planning, design, and sprint zero activities to ensure reliability, observability, resiliency, and operational readiness are embedded early in the SDLC
    • Partner with delivery teams to champion non‑functional requirements (NFRs) from design through production
  • System Reliability & Architecture
    • Drive the design and evolution of reliable, resilient, and scalable system architectures
    • Influence redundancy, fault tolerance, and disaster recovery strategies
    • Provide design recommendations that enable automated recovery and minimize manual intervention
    • Develop and maintain application recovery runbooks to improve recovery consistency and reduce downtime
  • Monitoring, Alerting & Observability
    • Design and implement comprehensive monitoring and observability solutions.
    • Define actionable alerts and establish Service Level Indicators (SLIs) and Service Level Objectives (SLOs).
    • Proactively identify and mitigate potential issues before they impact users.
  • Incident Management & Root Cause Analysis
    • Serve as incident commander during critical system outages, coordinating cross‑functional response and driving timely resolution.
    • Lead post‑incident reviews and root cause analyses, ensuring corrective actions prevent recurrence.
    • Continuously improve incident response processes and MTTR.
  • Automation & Tooling
    • Develop and maintain automation to streamline operational tasks, including:
      • Self‑healing mechanisms
      • Application deployments
      • Scaling strategies
      • Infrastructure and operational workflows
  • Development & Cross‑Functional Collaboration
    • Work closely with development teams to integrate SRE practices into the SDLC.
    • Promote reliability, observability, and operational excellence from design through production.
    • Collaborate with infrastructure, network, security, Scrum Masters, and internal/external stakeholders.
  • Security & Risk Integration
    • Partner with security teams to ensure systems are resilient against cyber threats.
    • Incorporate security best practices into operational and reliability designs.
    • Collaborate with IT Embedded Risk Managers to identify and remediate operational and reliability risks.
  • Operational Readiness
    • Lead operational readiness reviews with EAS L2 support teams at key project milestones.
    • Identify operational risks and gaps; validate NFRs in UAT environments to ensure production readiness.
  • Capacity & Performance Management
    • Proactively assess capacity needs and plan for future growth.
    • Implement scaling strategies to support high‑load and peak usage scenarios.
    • Analyze performance metrics,…