Site Reliability Engineer

Job in London, Greater London, W1B, England, UK

Listing for: Huxley

Full Time position
Listed on 2026-06-15

Job specializations:

IT/Tech
SRE/Site Reliability, Cloud Computing: Infrastructure & Operations, Systems Engineer, IT Support

Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 GBP yearly

Location: Greater London

Site Reliability Engineer (Cloud & Automation) - London - 2 days on site per week

A leading global financial services organisation is seeking a Site Reliability Engineer (SRE) to drive reliability, automation, and performance across its cloud-hosted platforms.

The Opportunity

This role sits within a high-performing Platform Operations function, acting as a central point of expertise for SRE methodologies and automation. You will play a key role in improving system resilience, scalability, and operational excellence across a complex, regulated environment.

Key Responsibilities

Lead the implementation of SRE best practices across cloud infrastructure
Drive improvements in observability, alerting, and capacity planning (SLA / SLO / SLI)
Identify and reduce operational toil through automation and remediation frameworks
Build and enhance GitOps and Infrastructure-as-Code capabilities (e.g. Terraform, Ansible)
Develop and review production‑grade code to support automation initiatives
Support incident management and on‑call processes, ensuring production stability
Contribute to post‑incident reviews, embedding SRE principles to reduce risk

Requirements

Demonstrable experience in SRE or infrastructure operations within cloud environments (AWS / GCP)
Strong scripting skills (Python, Ansible, or PowerShell)
Experience with Infrastructure as Code and GitOps methodologies
Hands‑on knowledge of observability / APM tools (e.g. Grafana, Datadog, Dynatrace)
Proven experience managing incidents, root cause analysis, and on‑call support
Understanding of SLA/SLO/SLI frameworks and reliability engineering principles

Desirable

Background in software development
Experience working within regulated financial services environments
Familiarity with ITIL and enterprise service management frameworks
Relevant certifications (e.g. AWS, Terraform)

Why Apply

Opportunity to shape cloud reliability strategy in a large‑scale environment
Work with modern tooling across automation, DevOps, and SRE practices
Strong emphasis on engineering excellence and continuous improvement
Competitive compensation and long‑term career progression
