Lead Cloud Site Reliability Engineer

Job in Leeds, West Yorkshire, ME17, England, UK
Listing for: Lloyds Banking Group
Full Time position
Listed on 2026-02-17
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability, Cloud Computing, IT Support
Salary/Wage Range or Industry Benchmark: 92701 - 109060 GBP Yearly
Job Description & How to Apply Below

End Date Tuesday 10 February 2026

Salary Range £92,701 - £109,060

Flexible Working Options Hybrid Working, Job Share

Job Description Summary

We support flexible working –  for more information on flexible working options

Job Description

Lead Site Reliability Engineer – Public Cloud Platform
Location: Halifax, Leeds or Manchester

Salary: £90,440 - £106,400

Working Pattern: Hybrid (2 days in office per week)

About the Opportunity

At Lloyds Banking Group, our purpose is to Help Britain Prosper. As we continue to transform into a modern, innovative, purposeful organisation, we’re investing heavily in cloud, automation and engineering excellence across our platforms.

We’re looking for a Lead Site Reliability Engineer (SRE) to join our Public Cloud Platform, supporting both GCP and Azure. In this role you’ll help strengthen observability, reliability, and operational excellence across our cloud estate, enabling our ambition to become the UK’s leading FinTech.

You’ll work closely with Product Owners and Engineering Leads to embed SRE principles, lead a team of up to 15 SREs, and champion a culture of learning, automation and continuous improvement.

What You’ll Be Doing
  • Lead, coach and develop a high‑performing SRE team, fostering autonomy, inclusion and continuous improvement.
  • Partner with Product Owners and Engineering Leads to embed reliability into roadmaps, backlogs and delivery decisions.
  • Apply SRE principles (SLIs, SLOs, error budgets) to ensure our services remain highly reliable, performant and scalable.
  • Drive improvements in observability—across metrics, logs, traces and events—ensuring services are observable by design.
  • Use Dynatrace as the primary observability platform for meaningful dashboards and customer‑centric alerting.
  • Own Infrastructure‑as‑Code and CI/CD‑based environments, implementing enhancements and responding to operational change.
  • Lead coordination of incident response and root cause analysis, supporting teams through major incidents, post‑incident reviews and prevention of recurrence.
  • Collaborate with multi‑disciplinary engineering teams to remove technical impediments, reduce toil and improve service operability.
  • Contribute hands‑on engineering where needed, validating technical decisions and guiding best practice.
  • Bring an approach of curiosity, experimentation, and first‑principles thinking to evolve our engineering culture.
What You’ll Bring

Essential Skills & Experience
  • Proven experience applying SRE practices within Azure, GCP, or both.
  • Strong understanding of SLIs, SLOs, error budgets, and how to use these to guide product and engineering decisions.
  • Experience ensuring reliability of production services, including availability, performance and recoverability.
  • Hands‑on or leadership experience in incident and problem management, focused on reducing MTTR and avoiding repeat issues.
  • Background in software engineering or cloud engineering, with good understanding of modern SDLC practices.
  • Practical experience with DevOps, CI/CD and automation to improve service reliability.
  • Experience improving observability on complex, distributed systems.
  • Ability to use data to influence prioritisation and balance reliability with feature delivery.
  • Collaboration and communication skills, working effectively with product, engineering and platform teams.
  • Experience mentoring engineers and promoting inclusive, supportive team culture.
Desirable Skills
  • Certifications or strong experience with Google Cloud Platform and/or Microsoft Azure.
  • Knowledge of Kubernetes, compute services, API management and large‑scale distributed systems.
  • Experience with Terraform, Jenkins, or equivalent configuration/pipeline tooling.
  • Ability to write and maintain scripts or code in languages such as Python, Bash, PowerShell or Groovy.
  • Solid grasp of cloud networking, security, and systems built around APIs.
  • Experience with Infrastructure as Code, modular design and scalable automation patterns.
About You

You’re someone who:

  • Is passionate about building resilient, observable, customer‑focused platforms.
  • Enjoys coaching others, sharing knowledge and shaping engineering culture.
  • Looks for opportunities to remove toil and introduce automation.
  • Thrives in…