Principal Site Reliability Engineering Expert Director

Job in Greater London, London, Greater London, W1B, England, UK
Listing for: Boston Consulting Group (BCG)
Full Time position
Listed on 2026-05-17
Job specializations:
  • IT/Tech
    Systems Engineer, Cybersecurity, Cloud Computing: Infrastructure & Operations, Network Security
Salary/Wage Range or Industry Benchmark: 100000 - 125000 GBP Yearly
Location: Greater London

What You’ll Do

The Principal Site Reliability Engineer (SRE) is a senior technical leader responsible for shaping how reliability, automation, and operational excellence are engineered across the organisation. The role operates across domains including traditional infrastructure, cloud engineering, network operations, identity, observability, security, AI-driven operations, and automated data workflows. It focuses on designing scalable systems, reusable engineering patterns, and standardised controls that reduce operational toil, improve resilience, and embed reliability, governance, and compliance directly into delivery pipelines and operational platforms.

This role will drive organisational change towards automation-first, measurable, and repeatable practices. A key part of the role is building and evolving reusable CI/CD and Terraform modules, engineering guardrails, observability patterns, and automation frameworks that can be adopted across multiple teams and domains without requiring each team to solve the same problems independently.

The Principal SRE also plays an important enablement role beyond deeply technical teams, helping less technical areas of the business adopt structured, governed, and scalable ways of working. This includes translating complex engineering practices into practical standards, improving how governance is implemented through engineering controls rather than manual oversight, and driving operational maturity across a broad and diverse technology landscape.

The ideal candidate is a systems thinker who understands how services, networks, identity, data flows, and operational processes fail in real-world conditions, and can apply that understanding to build automation-first, reliability-focused operating models that scale across both technical and non-technical functions.

Key Responsibilities

Cross-Domain Reliability Engineering
  • Design and evolve reliability patterns across cloud, network, identity, and security domains.
  • Identify systemic risks and failure modes across platforms and services, and define engineering solutions to mitigate them.
  • Ensure operational activities are embedded into delivery models through automation, CI/CD integration, and event-driven workflows.
Automation & Toil Reduction at Scale
  • Lead the design of automation frameworks that eliminate manual operational tasks across multiple domains.
  • Translate incident learnings and operational inefficiencies into scalable automation and preventative controls.
  • Drive adoption of automation-first principles, reducing dependency on human-driven processes.
  • Contribute to AI-driven operational use cases, including event correlation, anomaly detection, noise reduction, operational insights, and automated remediation.
  • Ensure AIOps capabilities are grounded in reliable telemetry, clear control boundaries, and measurable operational outcomes.
Observability & 24/7 Operational Excellence
  • Define standards for telemetry, monitoring, alerting, and operational visibility across all critical systems.
  • Ensure services are observable, measurable, and support proactive detection of issues.
  • Improve operational readiness, incident response effectiveness, and time-to-recovery through engineering solutions.
CI/CD & Platform Integration
  • Contribute to the design of CI/CD patterns that embed reliability, security, and operational controls into pipelines.
  • Ensure infrastructure, network, identity, and security configurations are managed through code and validated automatically.
  • Support integration of platform services into delivery pipelines to enable consistent, repeatable deployments.
Security & Identity Integration
  • Contribute to secure-by-design patterns, including least privilege, identity-based access, and short-lived credentials.
  • Support integration of security controls (e.g. secrets management, authentication, policy enforcement) into engineering workflows.
  • Ensure security and compliance requirements are met through engineering controls rather than manual processes.
Network & Infrastructure Reliability
  • Support the design of resilient network architectures and segmentation aligned with Zero Trust principles.
  • Ensure network configurations and controls are automated, validated, and observable.
  • Contribute to infrastructure design patterns that improve availability, scalability, and fault tolerance.
  • Design and improve operational patterns for network reliability, segmentation, visibility, and change validation.
  • Support automation and standardisation of network controls and operational procedures to reduce manual intervention and configuration drift.
Technical Leadership & Enablement
  • Provide technical leadership across teams, influencing standards, architecture, and engineering practices.
  • Mentor engineers on reliability engineering, automation, and systems thinking.
  • Drive consistency through reusable patterns, frameworks, and documentation.
Strategic Influence & Continuous Improvement
  • Contribute to reliability engineering strategy and roadmap across the organisation.
  • Communicate…