Site Reliability Engineer (SRE) – Cloud Platforms

Job in London, Greater London, W1B, England, UK

Listing for: Talenzon

Full Time position
Listed on 2026-06-04

Job specializations:

IT/Tech
Cloud Computing, SRE/Site Reliability, Systems Engineer, IT Support

Salary/Wage Range or Industry Benchmark: GBP 100,000 - 125,000 per year

Position: Site Reliability Engineer (SRE) – Cloud Platforms

Location: London, UK
Work Model: On-site
Role Type: Full-Time

What You’ll Do

Design and implement reliability strategies for high‑availability production systems
Monitor system health, performance, and uptime across cloud infrastructure
Build automation to reduce manual operations and improve system reliability
Develop and maintain observability systems including logging, metrics, and tracing
Manage incident response processes and perform root cause analysis for production issues
Improve system resilience through capacity planning, performance optimisation, and fault tolerance
Collaborate with engineering teams to integrate reliability practices into the software development lifecycle
Implement infrastructure automation using Infrastructure as Code

What We’re Looking For

Required Skills & Experience

Strong experience operating production systems in cloud environments such as Amazon Web Services, Google Cloud, or Microsoft Azure
Experience with container orchestration platforms such as Kubernetes
Strong experience with monitoring and observability tools such as Prometheus and Grafana
Proficiency in scripting or programming languages such as Python, Go, or Bash
Experience implementing Infrastructure as Code with tools such as Terraform
Strong understanding of Linux systems, networking, and distributed systems

Nice to Have

Experience with CI/CD pipelines using platforms such as GitHub Actions or GitLab
Familiarity with incident management frameworks and reliability engineering practices (SLIs, SLOs, error budgets)
Experience supporting microservices architectures and high-scale systems
Knowledge of distributed tracing and performance monitoring
