Site Reliability Engineer Job, Manchester area, England, UK, IT/Tech

The Site Reliability Engineer builds solutions that support Platform disaster response and crisis management activities in compliance with Engineering and Customer requirements. The role also helps provide and coordinate disaster preparedness for the organization’s Platform, helping ensure business continuity.

They also ensure we have enough resources to meet current and future Platform demand efficiently. This involves forecasting needs, planning capacity, monitoring performance (KPIs), managing risks (shortages and overloads), and developing optimisation strategies.

Your impact:

Work with Engineering & Service Management to ensure that the disaster recovery (DR) and capacity plans drive DR strategy and procedures in both cloud and DC venues.
Build out tooling that supports the DR plans and tracks progress and maturity against set KPIs and metrics.
Work with Engineering & Service Management to ensure that disaster recovery solutions are adequate, in place, maintained, and tested as part of the regular operational life cycle.
Provide ongoing feedback for risk management, mitigation, and prevention.
Develop and implement capacity planning tooling, frameworks, policies, and strategies.
Provide capacity requirements and impact assessments for new services or changes.
Collaborate with other Platform managers to deliver objectives on our platform evolution roadmap.

Your skills:

Experience with Linux administration
Experience with or strong understanding of Kubernetes
Comfort with a scripting language suitable for automation tasks
Understanding of current recovery solutions and high-availability architectures for cloud and on-premises environments
Understanding of Capacity Management & Planning scenarios and tooling
Experience with Agile principles and practices
Expertise in problem diagnosis across complex, distributed systems
Experience supporting SaaS products
Experience using AI for Automation
Experience with Incident Management, postmortems, and related practices
Knowledge of observability and monitoring best practices
Experience operating within one or more public clouds (AWS, GCP, Azure)
Experience with configuration management and infrastructure as code

As set forth in Anaplan’s Equal Employment Opportunity policy, we do not discriminate on the basis of any protected group status under any applicable law.

We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, perform essential job functions, and receive equitable benefits and all privileges of employment. Please contact us to request accommodation.
