Manager of System and Platform Operations (RMN)

Job in Greater London, London, Greater London, W1B, England, UK
Listing for: Epsilon
Full Time position
Listed on 2026-05-28
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer, IT Support, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 60,000 - 80,000 GBP yearly
Job Description & How to Apply Below
Position: Manager of System and Platform Operations (RMN)
Location: Greater London

Requirements

  • At least 5 years of hands-on experience in Site Reliability focused positions
  • Strong knowledge of containerization technologies (Docker, Kubernetes)
  • Experience with infrastructure as code (Terraform)
  • Solid understanding of networking, security, and system architecture
  • Proficient in programming and scripting languages (Java, Golang, Python, Bash, or similar)
  • Experience with monitoring and observability tools (Datadog, Prometheus, Grafana)
  • Knowledge of database management systems (PostgreSQL, Bigtable)
  • Understanding of API and microservices architecture
  • Strong people leadership skills, with at least one year leading and driving high-performance technical teams
  • Experience with operations teams within enterprise environments, with knowledge of DevOps, ITIL, Cloud Services, and IT Infrastructure and Operations, supporting and maintaining production and development environments and building cloud services that are secure, reliable, scalable and observable
  • Experience with establishing Service Delivery strategies that align with new ways of working, including Agile
  • Experience of establishing and delivering IT support services in a high-availability (HA) environment, such as 24/7 operations

What the job involves

  • The System and Platform Operations Manager is a technical leadership role responsible for the support, reliability and stability of Epsilon Retail Media production systems, environments and offerings
  • The team owns the reliability vision for the company, driving continuous improvement through a combination of development and operations initiatives as well as process excellence
  • This position and its team have solid-line responsibility for operations, including the deployment, management, monitoring, reporting, troubleshooting, and repair of production systems
  • Core to the success of the role is providing a premium customer support experience focused on a “center of excellence” that allows for a full-service delivery support cycle
  • This role is responsible for managing the Platform Operations team centralized within a single geo-region, orchestrating the regional teamwork, providing both technical and professional support, and championing the company values
  • The Platform Operations Engineer works closely with the Engineering team to ensure ongoing system stability and supports the Technical Account Managers from an environment perspective
  • The Platform Operations team is responsible for supporting all retailers once they are live
  • Critically important is how this team collaborates and liaises with other teams such as Customer Support, Technical Account Management, Engineering and Customer Success
  • You'll establish and manage operational practices and ensure we design, implement and operate a support model that is fit for purpose for our future
  • Adopt a “Measure Everything” approach to ensure that internal service level objectives and customer service level agreements are exceeded, including executive-level reporting on operational health metrics such as SLAs, incident resolution, performance, availability, reliability, capacity, etc.
  • Take ownership of complex issues related to performance, reliability, and scalability, and lead the resolution of serious incidents and events, including communications with customers and wider stakeholders
  • Provide insight and expertise on how customers will perceive changes and their impacts, to drive customer organizational change management and communication
  • Empower the Delivery teams to release new products, features, updates and fixes quickly, while ensuring platforms remain reliable and stable
  • Work with the wider Engineering, Product, Delivery and Security teams to ensure that appropriate attention is given to production/system reliability
  • Identify the capabilities needed to meet the current and emerging business needs of a significant function
  • As the subject matter expert on the team, maintain an understanding of current technology, database management, reliability practices, and future trends through ongoing education, conference attendance and industry press