Platform Operations Engineer

Job in London, Greater London, W1B, England, UK

Listing for: Plentific

Full Time position
Listed on 2026-06-01

Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 GBP per year

Location: Greater London

Requirements

Commercial experience in an infrastructure operations, DevOps, SRE or platform support role
Hands‑on experience operating AWS‑based environments
Strong troubleshooting skills across Linux systems, networking, DNS and cloud services
Experience participating in incident response and on‑call rotations
Familiarity with deployment tooling and configuration management systems
Ability to work calmly and methodically during high‑severity incidents
Clear communication skills and a collaborative approach to problem solving

What the job involves

Plentific is looking for a Platform Operations Engineer to ensure the reliability, availability and day‑to‑day operation of its cloud infrastructure and deployment platforms.
This role is run‑ and reliability‑focused: it owns break‑fix, incident response and operational support, working closely with the DevOps Platform team to continuously reduce failure rates and recovery times.
Own day‑to‑day operational support of production and non‑production environments across Portal/ESB and SaaS platforms
Act as a primary responder for infrastructure‑related incidents, participating in P1/P2 escalations and coordinated incident response
Perform break‑fix work for deployment failures, infrastructure outages and configuration errors, including data recovery from backups
Maintain and execute operational runbooks and incident playbooks
Support existing automation tooling (e.g. Ansible playbooks, deployment pipelines), escalating design improvements to the DevOps Platform team where required
Monitor system health using existing observability tools, responding to alerts and identifying recurring failure patterns
Work with engineering teams to diagnose infrastructure versus application‑level issues and route appropriately
Support security and compliance operational tasks, including access reviews, asset inventories and incident response support
Contribute to post‑incident reviews and continuous improvement initiatives
