
IB CTO Team - Site Reliability Engineer (SRE) - Assistant Vice President

Remote / Online - Candidates ideally in
Cary, Wake County, North Carolina, 27518, USA
Listing for: Dormont Manufacturing Co
Remote/Work from Home position
Listed on 2026-05-30
Job specializations:
  • IT/Tech
    Cloud Computing, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 100000 - 125000 USD Yearly
Job Description & How to Apply Below
Position: IB CTO Team - Site Reliability Engineer (SRE) - Assistant Vice President

Job Description:

Job Title: IB CTO Team - Site Reliability Engineer (SRE)

Corporate Title: Assistant Vice President

Location: Cary, NC

Who we are

In short – an essential part of Deutsche Bank’s technology solution, developing applications for key business areas.

Our Technologists drive Cloud, Cyber and business technology strategy while transforming it within a robust, hands‑on engineering culture. Learning is a key element of our people strategy, and we have a variety of options for you to develop professionally. Our approach to the future of work champions flexibility and is rooted in the understanding that there have been dramatic shifts in the ways we work.

Having first established a presence in the Americas in the 19th century, Deutsche Bank opened its US technology center in Cary, North Carolina in 2009.

Overview

We are looking for a Site Reliability Engineer (SRE) to join our global team. This role will focus on ensuring the operational health, reliability, performance, and scalability of the CARE platform and multi‑tenant applications, encompassing Google Cloud Platform (GCP) and on‑prem infrastructure, application deployment, and the underlying CARE services. You will be instrumental in defining and implementing SRE best practices to maintain a highly available and resilient platform.

As a senior IB SRE, you will be crucial in ensuring the continuous operation and improvement of the platform.

What We Offer You

  • A diverse and inclusive environment that embraces change, innovation, and collaboration

  • A hybrid working model, allowing for in‑office / work from home flexibility, generous vacation, personal and volunteer days

  • Employee Resource Groups support an inclusive workplace for everyone and promote community engagement

  • Competitive compensation packages including health and wellbeing benefits, retirement savings plans, parental leave, and family building benefits

  • Educational resources, matching gift and volunteer programs

What You’ll Do

  • Platform Reliability and Performance:
    Proactively monitor, troubleshoot, and resolve issues related to platform availability, performance, and capacity on both GCP and on‑prem infrastructure

  • Operational Excellence:
    Develop, implement, and maintain SRE best practices, including incident response, post‑mortems, root cause analysis, and proactive problem prevention

  • Automation and Tooling:
    Drive automation efforts to reduce manual toil across operational tasks, deployment, scaling, and recovery. This includes developing and improving monitoring, alerting, and self‑healing systems

  • SLI/SLO Management:
    Define, monitor, and report on Service Level Objectives (SLOs) and Service Level Indicators (SLIs) for key platform services, working to continuously improve them

  • Collaboration and Support:
    Liaise with application teams (tenants) to understand their operational needs, provide guidance on platform best practices for reliability and capacity planning, and assist with complex troubleshooting

  • Security and Compliance:
    Collaborate with security teams to ensure the platform adheres to security policies and compliance requirements, focusing on operational security aspects

Skills You’ll Need

  • Strong understanding of SRE principles and practices, including SLOs/SLIs, incident management, post‑mortems, and toil reduction

  • Deep understanding of GCP services such as GKE, Identity and Access Management (IAM), identity services, Cloud SQL, Cloud Monitoring, Cloud Logging, and related operational aspects. Extensive experience with Kubernetes and container orchestration, including configuration, troubleshooting, and performance tuning. Experience with Service Mesh (e.g., Istio) is highly desirable

  • Experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Splunk, Google Cloud Monitoring) and defining effective alerts and dashboards

  • Solid experience with Git and GitHub, including Git workflows for managing code, and with deployment tooling such as ArgoCD for deploying applications and managing their life cycles

  • Programming/scripting (e.g., Python, Go, Java, Bash) and Infrastructure as Code (e.g. Terraform) experience for automation, tooling development,…
