Senior SRE/DevOps Engineer; Data Storage
The Senior SRE/DevOps Engineer role sits within Arm's Global Storage team, supporting large-scale storage platforms used by engineering and HPC workloads across on-premises and cloud environments. This role focuses on making storage platforms reliable, observable, and easier to operate. It includes reducing manual work through automation, building practical tooling, and helping teams use storage services effectively. Working with colleagues across multiple regions, the role contributes to resolving issues, addressing root causes, and maintaining stable, well-performing systems that support Arm's technology development.
Responsibilities
- Maintain the reliability, availability, and performance of storage platforms used by engineering teams.
- Contribute to incident response, investigation, and problem resolution.
- Apply service reliability measures such as SLOs and SLIs where appropriate.
- Build and maintain infrastructure using Terraform and Ansible.
- Develop automation and Python-based tools to support operations and system insight.
- Use AI-based tooling to assist with monitoring, anomaly detection, and analysis.
- Develop simple agent-based workflows to support operational decision-making.
- Enhance monitoring and alerting to provide clear visibility of system behaviour.
- Work with engineering and security teams to maintain secure and well-managed systems.
- Maintain accurate documentation and share knowledge across the team.
Skills and Experience
- Experience working with production systems using DevOps or similar engineering practices.
- Experience with Infrastructure as Code tools such as Terraform or configuration tools such as Ansible.
- Ability to develop automation or tooling using a programming language such as Python.
- Experience supporting reliable and scalable systems in an operational environment.
- Experience with large-scale storage platforms (file or object) or HPC environments.
- Familiarity with AWS, GCP, or Azure.
- Exposure to CI/CD or Git-based workflows.
- Experience using or integrating AI/ML or agent-based tooling in operations.
- Understanding of identity, access control, and security practices.
- Experience with platforms such as LakeFS.
- Awareness of service management approaches (e.g. ITIL).
Arm is an equal opportunity employer, committed to providing an environment of mutual respect where equal opportunities are available to all applicants and colleagues. We are a diverse organization of dedicated and innovative individuals, and don't discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.
Benefits
Salary Range: £73,500 - £99,500 per year