Platform Operations Manager; DevOps & Site Reliability Engineering
Listed on 2026-06-16
IT/Tech
SRE/Site Reliability, Cloud Computing: Infrastructure & Operations, Systems Engineer, IT Project Manager
Location: Greater London
Working pattern: Hybrid, working from our Canary Wharf office 2-3 times per week
Reports to: Director of Engineering
Salary: Up to £75,000
Manages: Platform Operations & DevOps Team (UK & India)
Join Us at The Centre for ADHD Research and Excellence: Shaping the Future of Accessible Healthcare
At Care ADHD, our mission is to transform ADHD care through innovation, data, and technology — delivering accessible, patient‑centred services that improve outcomes for individuals, clinicians, and healthcare providers.
We believe that high‑quality data and meaningful insight are essential to improving clinical services, understanding patient journeys, and ensuring that care is delivered efficiently and effectively.
The Role
We are looking for an experienced and hands‑on Platform Operations Lead to own the reliability, availability, performance, and operational stability of Care ADHD's technology platforms. This role combines DevOps, Site Reliability Engineering (SRE), cloud infrastructure, platform operations, and technical leadership, ensuring that our systems are securely deployed, highly available, scalable, and operational 24/7/365. You will lead platform operations across both the UK and India, working closely with engineering, QA, security, and product teams to ensure our infrastructure and deployment capabilities support a fast‑moving and high‑quality engineering organisation.
This is a highly technical leadership role requiring someone who is equally comfortable defining operational strategy, improving engineering practices, and being hands‑on with cloud infrastructure, automation, monitoring, incident response, and reliability engineering.
Key Responsibilities
Platform Reliability & Operations
- Own the operational health, availability, and reliability of all production and non‑production environments
- Ensure platforms are monitored, maintained, and operational 24/7/365
- Lead platform incident management, root cause analysis, and service recovery processes
- Establish and improve operational readiness, resilience, and disaster recovery capabilities
- Define and manage SLAs, SLOs, and operational performance metrics
- Ensure high levels of platform uptime, stability, scalability, and security
- Design, build, and maintain cloud infrastructure primarily within AWS
- Lead infrastructure automation and Infrastructure as Code initiatives using Terraform or AWS CDK
- Design and optimise CI/CD pipelines to support efficient, secure, and reliable software delivery
- Improve deployment automation, release management, and environment consistency
- Support engineering teams with platform tooling, deployment strategies, and operational best practices
- Drive improvements in deployment reliability, infrastructure scalability, platform security, cost optimisation, and operational efficiency
- Implement and maintain observability solutions including monitoring, logging, alerting, and tracing
- Develop proactive approaches to incident prevention and operational resilience
- Lead reliability engineering practices including capacity planning, performance monitoring, fault tolerance, and high availability design
- Reduce operational toil through automation and self‑service tooling
- Establish strong incident response and post‑incident review processes
- Lead and mentor platform operations and DevOps engineers across the UK and India
- Build a collaborative, accountable, and high‑performing operational culture
- Allocate and coordinate operational resources across projects and platform priorities
- Work closely with the Director of Engineering to align platform strategy with product and engineering delivery goals
- Collaborate with engineering leads, QA, security, and product teams to support platform and release readiness
- Ensure infrastructure and operational processes follow security best practices
- Support compliance with GDPR and healthcare‑related operational standards
- Help implement operational governance, access controls, and infrastructure security policies
- Work closely with security and engineering teams to manage…