Senior Software Engineer / Reliability Engineering - Data

Job in London, Greater London, W1B, England, UK
Listing for: Bloomberg
Full Time position
Listed on 2026-06-18
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 60000 - 80000 GBP Yearly
Job Description & How to Apply Below
Position: Senior Software Engineer / Reliability Engineering - Real-time Data
Location: Greater London

Description & Requirements

Our department is responsible for efficiently distributing financial data from its source to interested users all around the world. This data includes, for example, stock prices and foreign exchange rates. It can either be served in response to a request or streamed in real time.

Location: London

Business Area: Engineering and CTO

The Group Owns
  • The distribution software and infrastructure
  • A range of different sources of data
  • Supporting services to administer and manage the system, including permissioning and metering

The team is also responsible for the Enterprise endpoint (“B-PIPE”), which allows end‑users to programmatically consume data via our SDK. Data is also available through the Bloomberg Terminal and Microsoft Excel.

The main challenge faced by the group is one of scale. Data is sourced from more than 370 global exchanges, with a combined volume in excess of 60 billion messages each day. We deliver this data to hundreds of thousands of terminals and thousands of B‑PIPEs. Handling this volume requires significant infrastructure; we manage multiple clusters in our main data centres, as well as a network of many thousands of servers around the world.

Group Overview

The RD Reliability Engineering group comprises three sub‑teams located in Tokyo, London, and New York, providing follow‑the‑sun support.

Our mission is to ensure systems are reliable, scalable, and observable through software engineering, while continuously improving how systems behave under load and failure conditions. We work in an outcome‑driven model, focusing on measurable improvements in availability, latency, capacity, and recovery. Our goal is to ensure systems meet defined service level objectives while minimising manual operational effort through automation and software solutions.

London Team Focus – Availability & Resiliency

The London team plays a key role in ensuring the availability and resiliency of RD infrastructure globally.

We Focus On
  • Detecting and preventing failures across large‑scale distributed systems
  • Ensuring infrastructure demonstrates sufficient capacity and failover capability during site‑loss scenarios
  • Reducing time to detect, diagnose, and recover from incidents
  • Ensuring systems behave predictably under both normal and adverse conditions

What You’ll Do
  • Build and maintain production‑grade software supporting Bloomberg’s global distribution infrastructure
  • Design and implement scalable, fault‑tolerant systems with a focus on observability, performance, and automation
  • Analyse system behaviour under real‑world and failure scenarios to validate that capacity, failover, and recovery meet resilience objectives
  • Identify bottlenecks, scaling limits, and reliability risks across distributed systems
  • Improve detection, diagnosis, and prevention of production issues
  • Build tools and frameworks to increase system visibility and reduce time to detect and resolve incidents
  • Automate operational workflows to reduce manual effort and improve system reliability
  • Partner with application and infrastructure teams to improve system design, resilience, and performance
  • Contribute to design discussions, incident reviews, and reliability improvements across the platform

Systems You’ll Work With
  • Configuration systems serving thousands of servers across the global network
  • Service discovery and clustering systems for distributed infrastructure
  • Monitoring and observability frameworks for large‑scale server estates
  • Tooling for diagnosing data quality and distribution issues
  • Ownership of systems may evolve over time as the team focuses on areas of highest impact

What Success Looks Like
  • Systems consistently meet defined reliability, latency, and capacity objectives
  • Issues are detected and mitigated before significant customer impact
  • Systems are demonstrably resilient, with proven failover capability and sufficient capacity under failure conditions
  • Operational processes are automated and scalable
  • Reliability is achieved through engineering improvements rather than manual intervention

What We’re Looking For

We’re not a traditional SRE team. We engineer reliability through software, building solutions that automate operations and improve system…

Position Requirements
10+ years of work experience