Platform Reliability Engineer
Job in London, Greater London, W1B, England, UK
Listed on 2026-03-04
Listing for: Ncounter Limited
Full Time position
Job specializations:
- IT/Tech: SRE/Site Reliability, Systems Engineer, Cloud Computing, Network Engineer
Ncounter is supporting a highly sophisticated, technology-driven trading environment in its search for a Platform Reliability Engineer to help operate, engineer, and continuously improve a large-scale distributed production platform used by researchers and software engineers. This role sits at the intersection of software engineering, infrastructure engineering, and production operations, with a strong focus on reliability, automation, observability, and operational excellence across mission-critical systems.
You will work closely with developers and infrastructure teams to maintain resilient services, diagnose complex production issues, and engineer tooling and automation that reduce operational toil while improving platform stability and performance.
Key Responsibilities
• Improve reliability and resilience of production platform services
• Build automation and internal tooling to streamline operational workflows
• Design observability across metrics, logging, tracing, and alerting
• Diagnose complex production issues and improve system performance
• Contribute to operational runbooks, incident reviews, and reliability standards
Experience Required
• Background in SRE, Production Engineering, or platform operations supporting large-scale systems
• Strong Linux troubleshooting experience across distributed or containerised environments
• Programming capability in Python with Git-based workflows and CI/CD pipelines
• Hands-on experience with observability platforms and monitoring systems
• Experience operating high-availability infrastructure and improving system resilience
Exposure to technologies such as Kubernetes, Prometheus, Grafana, ELK, Kafka, PostgreSQL, Redis, Terraform, or Ansible would be beneficial.
If you enjoy solving complex reliability challenges and building the tooling that keeps large-scale platforms operating smoothly, we would welcome a conversation.