Senior Site Reliability Engineer Job, Knutsford area, England, UK (IT/Tech)

Overview

Senior Site Reliability Engineer — Knutsford (hybrid, 2 days per week in office). A leading Financial Services firm is recruiting for a Senior Site Reliability Engineer to become part of a newly formed Core SRE Team that will establish a Centre of Excellence to enhance and promote SRE best practices.

About the Role

As a key hire, you will raise awareness and drive adoption of SRE methodologies within various teams. This is a hands-on engineering role where you will design, build, and optimise automation frameworks, observability tools, and incident response mechanisms. You will act as a trusted advisor, providing strategic guidance and consultative support to help teams improve reliability, scalability, and efficiency.

Responsibilities

Ensuring the availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning.
Response to system outages and disruptions, including their resolution and analysis, and implementation of measures to prevent similar incidents from recurring.
Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience.
Monitoring and optimisation of system performance and resource usage, identifying bottlenecks, and implementing best practices for performance tuning.
Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and close work with other teams to ensure smooth and efficient operations.

Required Skills

Proficiency in Programming and Scripting — languages such as Python, PowerShell, or Go for automating routine tasks and system deployments.
Incident Management and Troubleshooting — ability to manage incidents effectively, troubleshoot issues swiftly, and perform root cause analysis to prevent future incidents.
Systems Engineering and Automation — understanding of operating systems, networking, and cloud infrastructure; proficiency in automation tools for maintaining system reliability at scale.
Influential Communication Skills — ability to communicate effectively with team members and stakeholders to drive alignment and foster a collaborative environment for SRE practices.
Knowledge of Cloud Computing — familiarity with cloud platforms and services as infrastructure moves to the cloud.

Seniority level

Mid-Senior level

Employment type

Full-time

Job function

Information Technology

Industries

Technology, Information and Media
Financial Services
