
Sr Service Reliability Engineer – Kings Cross

Job in London, Greater London, EC1A, England, UK
Listing for: Universal Music Group
Full Time position
Listed on 2026-01-08
Job specializations:
  • IT/Tech
    Systems Engineer
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 GBP Yearly
Job Description & How to Apply Below
Music is Universal

It’s the passionate and dedicated team at Universal Music who help make us the world’s leading music company. From A&R to finance, legal to digital, sales to marketing, Universal Music is the place to grow and develop your career within a truly commercial and innovative business that leads in everything it does.

Everyone is welcome to apply for our roles, and we are determined to ensure that no applicant or employee receives less favourable treatment because of gender, race, disability, sexual orientation, religion, belief, age, marital status, background, pregnancy, or caring responsibilities. We also recognise the importance of diversity of thought within our teams and are fully committed to embracing the talents of people with autism, dyslexia, ADHD, and other forms of neurocognitive variation.

We will always seek to make appropriate adjustments to recruitment, workplaces, and work processes to be fully inclusive to people with different needs and working styles. If you need us to make any reasonable adjustments for you from application onwards, including alternatives to the online form or to disclose a neurocognitive condition, please email Uni
Job Summary:

We are UMG, the Universal Music Group. We are the world’s leading music company. In everything we do, we are committed to artistry, innovation and entrepreneurship. We own and operate a broad array of businesses engaged in recorded music, music publishing, merchandising, and audiovisual content in more than 60 countries. We identify and develop recording artists and songwriters, and we produce, distribute and promote the most critically acclaimed and commercially successful music to delight and entertain fans around the world.

As a key member of our Global Technical Operations team, you will be the ultimate escalation point and subject matter expert for all SRE operations. This is a senior technical role that requires a strategic mindset and deep expertise in System Reliability Engineering. By blending a software engineering mindset with operational expertise, you will engineer solutions that improve system reliability, automate complex processes, and reduce manual toil.

You will not only resolve the most challenging technical issues but also drive the operational strategy for SRE implementation. As a Site Reliability Engineer, you won't just be supporting systems; you'll be ensuring the services that connect artists and fans around the globe are always on.
Job Functions:

Key Responsibilities:

* System Reliability & Performance:
  - Design, build, and maintain the availability, scalability, and performance of critical services.
  - Develop and maintain robust monitoring, alerting, and observability systems (e.g., using AWS CloudWatch, Dynatrace) to ensure rapid issue detection and resolution.
  - Monitor infrastructure capacity and performance, providing analysis and suggestions for service delivery improvement.
* Automation & Efficiency:
  - Drive the automation of repetitive operational tasks, including infrastructure provisioning, deployments, and scaling.
  - Create and maintain scripts and custom code to support and enhance our operational toolset.
  - Support and optimize CI/CD pipelines to improve deployment speed and reliability.
* Incident Management & Collaboration:
  - Participate in an on-call rotation to troubleshoot and mitigate production incidents.
  - Lead post-incident reviews and root cause analyses to implement lasting solutions.
  - Partner with engineering and IT stakeholders to embed SRE best practices (SLOs, error budgets) into the design and development lifecycle.
* Act as the Final Escalation Point for SRE operations:
Participate in resolving the most complex and critical incidents, which other teams have been unable to solve. Provide leadership during high-severity events, coordinating cross-functional teams to ensure rapid and effective resolution.
* Develop Escalation Frameworks:
Design, implement, and refine the escalation management process for the entire Global Technical Operations Center, ensuring that incidents are triaged, documented, and resolved efficiently.
* Strategic…