Site Reliability/GitOps Engineer
Listed on 2025-12-27
IT/Tech
Cloud Computing, Systems Engineer
Site Reliability / GitOps Engineer – Canonical
Canonical is a leading provider of open source software and operating systems to the global enterprise and technology markets. Canonical’s Ubuntu platform is widely used in breakthrough enterprise initiatives such as public cloud, data science, AI, engineering innovation, and IoT, with customers that include the world's leading public cloud and silicon providers and industry leaders in many sectors. Canonical is a pioneer of global distributed collaboration, with over 1,200 colleagues in more than 75 countries and very few office‑based roles.
Teams meet two to four times yearly in person in interesting locations around the world to align on strategy and execution.
This role is available remotely in any timezone.
Job Summary
The IS team at Canonical supports and maintains all of Canonical's production IT services, which serve over 60 million Ubuntu users. As an SRE and GitOps engineer, you will drive operations automation and infrastructure as code in both our private and public clouds, leveraging open‑source infrastructure as code tools and CI/CD pipelines. You will also improve Canonical products and open‑source technologies by providing feedback, submitting bugs, and collaborating on design and implementation across teams.
Responsibilities
- Apply your IaC experience to develop and improve infrastructure as code practices within IS, increasing automation and streamlining IaC processes.
- Automate software operations for re‑usability and consistency across private and public clouds, considering the complexities of distributed systems.
- Develop new features and improve the resilience and scalability of Canonical’s current cloud and container portfolio.
- Maintain operational responsibility for Canonical’s core services, networks, and infrastructure.
- Build and maintain observability tools such as Prometheus, Grafana, and Elasticsearch; design, implement, and maintain monitoring and alerting for various systems and services.
- Collaborate with development teams to design service architecture, documentation, playbooks, policies, and operational procedures.
- Assist and work with globally distributed engineering, operations, and support peers.
- Allocate uninterrupted development time to focus on larger projects and automation of manual tasks.
- Share your experience, know‑how, and best practices with team members in design sessions, mentorship, and collaborative work.
- Hold final responsibility for time‑critical escalations.
Requirements
- Deep experience defining operations in code, using version control, peer review, and CI/CD to roll out changes to applications and infrastructure.
- Strong modern engineering background, including peer‑review, unit testing, SCM, CI/CD, and Agile.
- Python software development experience on large projects.
- Practical knowledge of Linux networking, routing, and firewalls.
- Affinity with various Linux storage technologies, from Ceph to databases.
- Hands‑on experience administering enterprise Linux servers.
- Extensive knowledge of cloud computing concepts and technologies.
- Bachelor’s degree or greater, preferably in computer science or a related engineering field.
- Excellent communication skills in English for email, chat, video or voice calls, and in‑person meetings.
- Motivated to troubleshoot from kernel to web and willing to ask others when appropriate.
- Willingness to be flexible and learn new things quickly.
- Passionate about the needs of fast‑changing environments and distributed teams.
- Deep familiarity with open‑source, especially Ubuntu or Debian systems.
Canonical is an equal opportunity employer. We are proud to foster a workplace free from discrimination. Diversity of experience, perspectives, and background creates a better work environment and better products. Whatever your identity, we will give your application fair consideration.