Senior Cloud Operations Engineer
Job in Ruddington, Nottingham, Nottinghamshire, NG1, England, UK
Listed on 2026-02-08
Listing for: MHR
Full Time position
Job specializations:
- IT/Tech
SRE/Site Reliability, Systems Engineer, Cloud Computing, IT Support
Job Description & How to Apply Below
Your Team
As part of the Cloud Operations team, you will play a vital role in supporting the People First SaaS platform, a modern, microservices-based HR and payroll solution built in Azure and delivered to hundreds of customers.
Your Impact
As a Senior Site Reliability Engineer, you will help ensure the reliability, scalability and automation of MHR’s People First platform through effective cloud operations, observability and continuous improvement. You will apply SRE principles to build resilient systems and strengthen operational excellence in Azure.
In this role you will:
- Build, deploy and maintain cloud environments through automated processes, ensuring consistent, reliable and scalable platform operations.
- Implement and optimise monitoring, alerting and diagnostics, using observability data to support SLIs/SLOs, reduce MTTR and improve service reliability.
- Collaborate with Platform and Development teams to ensure systems are designed for operability, resilience, performance and effective capacity management.
- Automate provisioning, scaling and configuration management using scripting and IaC tooling to minimise toil and improve repeatability.
- Contribute to incident response and root cause analysis, document and evolve operational standards, and participate in the on-call rota to support platform availability.
- Experience with DevOps and Continuous Delivery practices and how they apply to reliable service operation.
- Experience working with backend services built in Java, .NET or similar languages, and an understanding of how these architectures influence deployments, change management and rollback safety.
- Experience implementing logging, metrics and tracing for backend services using tools such as Dynatrace, Azure Monitor, Application Insights or Grafana to inform SLIs/SLOs and reduce MTTR.
- Strong understanding of IaC principles and experience delivering consistent, auditable environments using Terraform and ideally Bicep.
- Hands-on experience operating cloud-hosted SaaS platforms in Microsoft Azure with a focus on resilience, autoscaling, fault tolerance and operational readiness.
- Ability to automate workflows across build, deploy, configuration, drift correction and resilience tasks using PowerShell, Terraform or similar scripting.
- Experience supporting incident response, performing root cause analysis, contributing to post-incident reviews and implementing preventive measures across backend services.
- Experience designing or maintaining CI/CD pipelines for Java, .NET or similar codebases, including quality gates, test automation, performance checks and release observability.
Position Requirements
10+ years of work experience