Site Reliability Engineer; SRE - Application Support
City Of London, Central London, Greater London, England, UK
Listed on 2025-11-13
IT/Tech
Cloud Computing, Systems Engineer, IT Support, SRE/Site Reliability
Location: City Of London
Site Reliability Engineer (SRE) - Application Support
About
Step forward into the future of technology with ZILO™.
We're here to redefine what's possible in technology. While we're trusted by the global Transfer Agency sector, our technology is truly flexible and designed to transform any business. We've created a unified platform that adapts to diverse needs, offering the scalability and reliability legacy systems simply can't match.
At ZILO™ our DNA is built on Character, Creativity, and Craftsmanship. We face every challenge with integrity, explore new ideas with a curious mind, and set a high standard in every detail.
We are a team of dedicated professionals where everyone, regardless of their role, drives our progress and creates real impact. If you're ready to shape the future, let's talk.
Job Description
We're looking for a Site Reliability Engineer to join our SRE team: someone who thrives on solving complex production issues, understands how applications behave in the real world, and takes pride in keeping systems reliable and performant. This is not a platform engineering role. You won't just be spinning up Kubernetes clusters or building infrastructure; you'll be deeply involved in understanding our applications, what they do and how they operate, troubleshooting real-world issues, and working directly on improvements that impact our customers every day.
What You'll Do
- Incident Response & Troubleshooting: Investigate and resolve incidents raised by clients, diving into logs, metrics, and application code to identify root causes.
- Application Debugging: Work across our core stack – Java, Golang, and Python – to trace and fix issues affecting reliability or performance.
- Data Fixes: Perform data investigation and fixes using Postgres.
- Operational Excellence: Patch and maintain Kubernetes clusters and other production systems.
- SRE Roadmap: Contribute to the continuous improvement of our observability, reliability, and automation initiatives.
This role is hybrid and will require regular weekly attendance at our London office.
Requirements
- Solid experience with application debugging in at least one of: Java, Golang, or Python.
- A good grasp of PostgreSQL – enough to run queries, analyse data, and perform safe fixes.
- Familiarity with Kubernetes and modern cloud platforms (AWS, GCP, or Azure).
- Understanding of incident management and observability tools (Grafana, Prometheus, etc.).
- A mindset focused on reliability, quality, and ownership.
Benefits
- Enhanced leave – 38 days inclusive of 8 UK Public Holidays
- Private Health Care including family cover
- Life Assurance – 5x salary
- Flexible working – work from home and/or in our London Office
- Employee Assistance Program
- Company Pension (Salary Sacrifice options available)
- Access to training and development
- Buy and Sell holiday scheme
- The opportunity for “work from anywhere/global mobility”
Seniority level: Associate
Employment type: Full-time
Job function: Other