Systems Engineer, Production Job, Toronto area, Ontario, Canada, IT/Tech

Clio is the global leader in legal AI technology, empowering legal professionals and law firms of every size to work smarter, faster, and more securely.

We are transforming the legal experience for all by bettering the lives of legal professionals while increasing access to justice.

Summary

We are currently seeking a Systems Engineer, Production to join our Platform Team. This role is available to candidates across Canada. If you are local to one of our hubs (Burnaby, Calgary, or Toronto), you will be expected to be in the office at least twice per week on our Anchor Days.

What your team does

The Systems Engineering team at Clio builds and operates the systems that power all of our cloud-based products and services. We design, automate, and maintain the infrastructure that enables our global teams to deliver reliable, secure, and scalable solutions to thousands of customers every day. Our mission is to make deploying and operating software effortless and safe. We focus on automation, observability, and reliability, ensuring every engineering team at Clio can move faster and with confidence.

You’ll be part of a globally distributed team, collaborating closely with engineers in North America and Europe, driving technical excellence across Clio’s production environments.

Who you are

You’re an experienced cloud engineer who thrives on building infrastructure that just works – scalable, secure, and self‑healing. When you’re solving problems, you’re just as comfortable writing code as you are digging into cloud architecture. You communicate clearly, enjoy mentoring others, and bring both curiosity and calm to complex challenges.

What you’ll work on

Design, build, and maintain the AWS infrastructure supporting Clio’s production environments.
Implement and evolve Infrastructure‑as‑Code using Terraform, ensuring reproducibility and compliance.
Collaborate with developers to improve CI/CD pipelines, deployment strategies, and overall developer experience.
Enhance observability and reliability, refining alerting, monitoring, and incident response.
Collaborate on cloud optimization projects, improving performance, cost efficiency, and security posture.
Mentor and guide team members, fostering a culture of technical excellence and continuous learning.
Partner with Security, MLOps, and Product Engineering teams to deliver scalable, resilient, and compliant systems.
Define agentic patterns that shape the way we build, deploy, and observe cloud infrastructure.

What you may have

Experience designing, deploying, and operating AWS infrastructure and managed services (like RDS, EKS, Lambda) at scale.
Expertise with Terraform and infrastructure automation.
Strong coding foundation, beyond scripting, with experience in application development or designing and building robust automation frameworks and tooling.
Experience in container orchestration and cluster management.
Experience with CI/CD systems such as Buildkite or GitHub Actions.
Experience with relational or non‑relational data stores in a production environment.
Familiarity with observability platforms (Datadog, Prometheus, Grafana).
Familiarity with Linux systems administration, networking, and troubleshooting.
Excellent communication and documentation abilities, with a focus on knowledge sharing and team collaboration.
Growth mindset when it comes to process improvement and new technologies, especially AI.

Serious bonus points if you have

Experience with Kubernetes (EKS) on AWS in production environments.
Expertise with best practices for managing and securing cloud environments at scale.
Prior experience using AI to accelerate your own work and your team’s, for example by generating code and debugging infrastructure issues.
Prior experience in platform engineering or developer productivity teams.
Prior experience working with software teams building and shipping applications to the cloud.
Hands‑on experience managing and scaling distributed data stores (e.g., MySQL/Aurora), including troubleshooting database bottlenecks and implementing robust backup and disaster recovery strategies.

What you will find here

Compensation is one of the main components of Clio’s Total Rewards Program.…