Site Reliability Engineer
Job in Glasgow, Glasgow City Area, G1, Scotland, UK
Listed on 2026-01-29
Listing for: Paritas Recruitment
Contract position
Job specializations:
- IT/Tech
Cloud Computing, SRE/Site Reliability, Systems Engineer, Data Engineer
Job Description & How to Apply Below
AWS Site Reliability Engineer (Data Platform) – Contract
Location: Glasgow
Contract Length: February 2026 – January 2027
We are recruiting an AWS Site Reliability Engineer (SRE) to support a cloud-native data platform for a major international financial services organisation. The platform is built on AWS, with core components including Snowflake and Databricks, and underpins critical analytics and data services used across the business.
This role focuses on reliability engineering, automation, observability, and resilience. You will work closely with data engineering and platform teams to ensure the platform is scalable, highly available, and operationally robust in a regulated environment.
Key Responsibilities
- Design, build, and maintain automation for infrastructure provisioning, platform operations, and incident response using Infrastructure as Code (IaC) and CI/CD
- Lead resiliency and disaster recovery (DR) planning, including DR testing, failure scenarios, and recovery validation across AWS and data platform services
- Define and manage SLIs, SLOs, and SLAs for critical data pipelines and platform services, using error budgets to drive reliability improvements
- Build and operate comprehensive observability solutions (metrics, logs, traces, alerting) across AWS, Snowflake, and Databricks workloads
- Partner with data engineering and platform teams to embed reliability-by-design into architecture and delivery
- Perform root cause analysis (RCA) on incidents and drive continuous improvement to reduce operational toil
- Own and drive resolution of incidents and service requests raised by platform consumers, identifying recurring issues and automating fixes to improve reliability and user experience
Key Skills and Experience
- Strong practical experience applying Site Reliability Engineering (SRE) principles, including SLO/SLI/SLA design and error budgets
- Proven production experience with AWS (e.g. EC2, S3, IAM, VPC, CloudWatch)
- Hands-on experience with automation and Infrastructure as Code (Terraform, CloudFormation, or CDK)
- Experience building and operating observability and monitoring solutions
- Scripting experience in Python and/or Bash
- Exposure to data platforms such as Snowflake and/or Databricks