Senior Site Reliability Engineer / Technical Architect

Job in Winnersh, Wokingham, Berkshire, RG40, England, UK
Listing for: United States Digital Space LLC
Full Time position
Listed on 2026-06-15
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability, Cloud Computing: Infrastructure & Operations
Salary/Wage Range or Industry Benchmark: GBP 45,000 per year
Job Description & How to Apply Below
Position: Senior Site Reliability Engineer / Technical Architect
Location: Winnersh

We are looking for a highly experienced Senior Site Reliability Engineer / Technical Architect with strong hands‑on expertise in cloud infrastructure, Kubernetes, platform engineering, automation, observability, and AI‑assisted engineering.

The ideal candidate will have deep experience designing, building, and operating reliable, scalable, and secure infrastructure across AWS, Azure, Kubernetes, Terraform, CI/CD, GitOps, and monitoring platforms. This role requires strong ownership of production systems, incident management, automation, infrastructure standards, and collaboration with engineering, security, and platform teams.

Key Responsibilities

Design, build, and maintain scalable cloud infrastructure across AWS and Azure.

Manage Kubernetes platforms including EKS, AKS, Helm, Argo CD, and GitOps workflows.

Create reusable Terraform, Ansible, and automation patterns for infrastructure provisioning.

Define and improve SLOs, SLIs, monitoring, alerting, dashboards, and incident response processes.

Implement observability using tools such as Datadog, Grafana, Prometheus, Loki, Tempo, OpenTelemetry, Splunk, and related platforms.

Improve platform reliability, reduce operational toil, and support root cause analysis during incidents.

Support secure infrastructure access using IAM, Okta, Teleport, RBAC, MFA, TLS/PKI, Secrets Manager, and cloud security controls.

Work with CI/CD tools such as Jenkins, GitLab CI, GitHub Actions, and Argo CD to improve deployment reliability.

Support Linux, Windows Server, Active Directory, DNS, DHCP, LDAP, and Group Policy environments.

Manage large-scale GPU/HPC workloads using SLURM, PySpark, anomaly detection pipelines, and bare‑metal provisioning with IPMI and PXE boot.

Apply AI‑assisted engineering tools such as Cursor, Claude Code, GitHub Copilot, AWS Bedrock, Ollama, Datadog Watchdog, and Grafana AI Agents to improve automation, troubleshooting, and delivery.

Partner with engineering, security, and business teams to turn operational and regulatory requirements into practical platform standards.

Required Skills

Strong experience in Site Reliability Engineering, DevOps, Cloud Infrastructure, or Platform Engineering.

Hands‑on experience with AWS services such as EC2, EKS, ECS, Lambda, RDS, S3, VPC, CloudFront, Route 53, IAM, KMS, WAF, and Secrets Manager.

Experience with Azure services including AKS, Virtual Machines, Virtual Networks, Storage Accounts, Load Balancer, Azure Monitor, and Entra.

Strong Kubernetes, Docker, Helm, Terraform, Ansible, and GitOps experience.

Good scripting and automation skills using Python, Bash, or similar languages.

Strong monitoring and observability experience with Datadog, Grafana, Prometheus, Loki, Tempo, OpenTelemetry, Splunk, or Nagios.

Experience with incident response, production support, root cause analysis, capacity planning, cost optimisation, and reliability improvement.

Good understanding of networking, DNS, DHCP, LDAP, load balancers, firewalls, CDN, VPN, and security controls.

Experience working in regulated, high‑availability, or large‑scale production environments.

Preferred Certifications

Certified Kubernetes Administrator

AWS Certified Solutions Architect

Red Hat Certified Engineer

Microsoft Certified Solutions Expert

CCNA Routing and Switching / Security

Candidate Profile

This role is suitable for a senior engineer or architect with 15+ years of experience across SRE, cloud, DevOps, infrastructure, and platform engineering. The candidate should be comfortable with both hands‑on technical delivery and architecture‑level decision-making, with a strong focus on reliability, automation, security, and developer productivity.

Job Type: Full‑time

Pay: £45,000.00 per year

Benefits
  • Flexitime
Licence/Certification
  • Certified Kubernetes Administrator (required)

Work Location: In person

Position Requirements
10+ Years work experience