DevOps Tech Lead
Listed on 2026-02-12
IT/Tech
Systems Engineer, Cloud Computing, SRE/Site Reliability, IT Project Manager
At Frasers Group we’re rethinking retail. Through digital innovation and unique store experiences, we’re serving our consumers with the world’s best sports, premium and luxury brands globally. As a leader in the industry, we’re elevating the retail experience for our consumers through our collection of established brands, including Sports Direct, FLANNELS, USC, Frasers, and GAME.
Our vision – we are building the world’s most admired and compelling brand ecosystem
Our purpose – we are elevating the lives of the many with access to the world’s best brands and experiences
At Frasers Group, we fear less and do more. Our people are forward thinkers who are driven to operate outside of their comfort zone to change the future of retail, embracing challenges along the way. The potential to elevate your career is massive, the experience unrivalled. To be able to make the most of it you need to live and breathe our principles:
- Think without limits - Think fast, think fearlessly, and take the team with you
- Own it and back yourself - Own the basics, own your role and own the results
- Be relevant - Relevant to our people, our partners and the planet
Are you ready to join the Fearless?
Job Description
Are you passionate about building resilient cloud infrastructure, championing DevOps culture, and leading high-performing teams? We’re looking for a DevOps Tech Lead to join our dynamic Digital Engineering team and lead the development, operations, and continuous improvement of core platform services, cloud infrastructure, and DevOps practices.
This is a unique opportunity to shape the technical foundation that powers over 50 websites and 10 mobile applications for some of the UK’s most iconic retail brands.
As a DevOps Tech Lead, you’ll drive operational excellence, reliability, and scalability across our platforms. You’ll guide a squad focused on cloud infrastructure, observability, incident response, and automation, blending hands‑on technical work (approximately 30% of your time) with team leadership and strategic thinking.
Key Responsibilities
- Lead a high‑performing DevOps/Platform squad responsible for infrastructure‑as‑code, observability, CI/CD, and cloud operations.
- Drive the design and implementation of scalable, secure, and highly available cloud environments (primarily Azure) using Terraform and modern DevOps tools.
- Spend approximately 30% of your time hands‑on, contributing to IaC, automation scripts, monitoring setups, and critical system improvements.
- Guide your team through effective incident management, ensuring rapid resolution, root cause analysis, and continuous improvement of system reliability.
- Champion SRE best practices including error budgeting, SLAs/SLOs/SLIs, and proactive reliability engineering.
- Coach and mentor engineers through pairing, reviews, and architectural discussions.
- Collaborate with product and engineering stakeholders to define platform roadmaps and priorities.
- Continuously evolve our CI/CD pipelines, GitHub workflows, release strategies, and operational playbooks.
- Foster a DevOps culture with a strong focus on automation, security, observability, and accountability.
- Ensure effective Agile ceremonies and delivery practices are followed, adapting where needed for high‑performing operations‑focused teams.
Requirements
- Proven experience leading DevOps or SRE teams, with a strong focus on infrastructure, automation, and operational excellence.
- Deep expertise in cloud platforms (preferably Azure), including identity, networking, cost management, and resource provisioning.
- Strong experience with Terraform and Infrastructure‑as‑Code principles in production environments.
- Practical knowledge of container orchestration (Docker, Kubernetes, AKS).
- Proficient with CI/CD systems and GitHub Actions or similar tooling.
- Experience designing and operating distributed systems with high reliability, security, and observability.
- Working knowledge of monitoring and alerting tools (e.g., Prometheus, Grafana, Azure Monitor, Honeycomb, or Datadog).
- Solid understanding of event‑driven systems, RESTful APIs, and caching solutions like Redis.
- Effective communicator with the ability to…