Senior DevOps Engineer

Job in Cardiff, Cardiff City Area, CF10, Wales, UK
Listing for: Digital Science
Full Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer

About Us

We are Digital Science and we are advancing the research ecosystem. We are a pioneering technology company, and our vision is of a future where a trusted and collaborative research ecosystem drives progress for all. We believe in better, open, collaborative and inclusive research. In creating the next generation of tools and working in partnership with the community we tackle some of the biggest challenges to research.

In order to achieve our vision, we need innovative, inspiring and dynamic people to join our team. Want to join us?

Your New Role:
Senior DevOps Engineer, focusing on Overleaf Infrastructure

We are recruiting for a Senior DevOps Engineer within the wider Digital Science Product organization, where you will directly support one of our most critical and high‑profile products: Overleaf. We're looking for a talented Senior DevOps Engineer to join our team and help us maintain the reliability, scalability, and performance of the systems that power Overleaf’s most critical platforms. Operating primarily on Google Cloud (GCP), you will use your knowledge of distributed systems and architecture to ensure smooth, global operations and improve overall system health. You will work closely with cross‑functional teams to identify and mitigate risks, supporting platforms that require world‑class reliability and automation.

What you’ll be doing
  • GCP Infrastructure Ownership:
    Own our infrastructure on Google Cloud Platform and the Terraform codebase, managing critical components including VPCs, Compute Engine, Kubernetes Clusters, Cloud SQL/Redis, Load balancers, Cloud Armor, logging/monitoring pipelines, and IAM.
  • Automation & CI/CD:
    Build and optimize CI/CD pipelines using Jenkins or similar tools, and automate routine operations with shell scripts where appropriate.
  • Reliability & Monitoring:
    Implement and manage monitoring, alerting, and incident response systems using Google Cloud Monitoring and similar tools. Participate in a rotating on‑call schedule for critical infrastructure issues outside normal business hours.
  • Database Management:
    Ensure the performance, reliability, and uptime of PostgreSQL and Mongo databases with proactive monitoring and tuning.
  • Cost Management:
    Oversee resource usage on GCP to ensure efficient cost management.
  • Collaboration & Knowledge Sharing:
    Take a collegial approach to sharing knowledge with engineers, building consensus for change, and writing excellent documentation.
What you’ll bring to the role
Essential Experience
  • Cloud & Containers:
    Significant working knowledge of cloud‑computing environments such as GCP or AWS. Strong hands‑on expertise in Kubernetes and Docker.
  • Infrastructure as Code (IaC):
    Strong hands‑on expertise in Terraform.
  • Operating Systems & Scripting:
    Solid Linux/Unix systems knowledge and scripting skills (Bash/Python).
  • DevOps Tooling:
    Experience with CI/CD tools (e.g., Jenkins) and monitoring platforms (e.g., Grafana, Google Cloud Monitoring).
  • Database Expertise:
    Experience working with databases such as Mongo, PostgreSQL, and Redis.
  • SRE Practice:
    Know how to implement best‑practice alerting, monitoring, and observability on applications that experience high load.
  • Incident Management:
    An excellent track record of dealing with production incidents and post‑incident analysis.
  • Agile:
    Significant experience working in an Agile methodology and implementing best practices in version control and code review.
Mindset
  • A security‑first mindset at all times, covering confidentiality, integrity, and availability.
  • A commitment to staying up‑to‑date with emerging technologies and implementing innovative cloud solutions.
Nice to Have
  • Understand error budgets, SLIs, and SLOs.
  • Understand how to manage cloud computing costs effectively.
  • Experience coding in a language such as JavaScript.

Don't worry if you don't meet every qualification—let us be the judge! Studies show that many qualified candidates from under‑represented groups hesitate to apply unless they meet every single requirement. We are dedicated to building a diverse and…

Position Requirements
10+ years of work experience