The Site Reliability Engineering (SRE) team at ESO is responsible for ensuring the reliability, scalability, and performance of our production systems. We operate at the intersection of engineering and operations, with a strong focus on automation, observability, and continuous improvement.
As a Site Reliability Engineer, you will work hands‑on with cloud‑native systems, supporting production and pre‑production environments to maintain system health, improve resiliency, and optimize performance. You’ll partner closely with engineering, infrastructure, and database teams to troubleshoot complex issues, enhance automation, and ensure our services meet reliability and availability expectations.
This role is ideal for an engineer who enjoys solving challenging problems, digging into application and database behaviour, and continuously improving how systems operate in a fast‑paced, high‑impact environment.
What You’ll Do
Support and maintain production and non‑production cloud environments (Azure/AWS).
Troubleshoot complex, distributed, cloud‑based applications to identify root causes and implement durable fixes.
Monitor system health, performance, and reliability using observability tools (e.g., New Relic, ELK and Zabbix).
Investigate application and database performance issues, including writing and optimizing SQL queries.
Participate in incident response, debugging, and post‑incident reviews focused on continuous improvement.
Contribute to CI/CD pipelines (e.g., Azure DevOps) to improve automation, reliability, and deployment processes.
Write and maintain automation scripts (PowerShell, Bash, Python, or similar) to streamline operational workflows.
Collaborate with developers to understand code behaviour and support troubleshooting efforts in C#/.NET‑based systems.
Help improve reliability standards, documentation, and operational best practices.
What We’re Looking For
Hands‑on experience working in a cloud environment (Microsoft Azure strongly preferred).
Experience supporting and troubleshooting complex, cloud‑native applications in production environments.
Strong understanding of relational databases and solid experience writing and troubleshooting SQL queries.
Ability to read and understand application code (preferably C#/.NET) to support debugging and issue resolution.
Experience working with at least one CI/CD platform (e.g., Azure DevOps).
Familiarity with monitoring and observability tools (e.g., New Relic) and core concepts such as logs, metrics, and traces.
Experience with scripting/automation (PowerShell preferred).
Strong analytical and problem‑solving skills with attention to detail.
Clear written and verbal communication skills.
Who You Are
Passionate about reliability engineering and operational excellence.
Curious and eager to learn, actively seeking feedback and continuously growing your technical skill set.
Coachable and adaptable, able to thrive in a fast‑paced and evolving environment.
Comfortable navigating ambiguity and taking ownership of problems through to resolution.
A collaborative team player who values accountability and continuous improvement.
Nice to Have
Experience working with Linux‑based systems.
Experience working with Kubernetes and container systems.
Exposure to infrastructure‑as‑code tools (e.g., Terraform).
Familiarity with Git‑based version control workflows.
At ESO, reliability is core to our customer experience. As part of the SRE team, your work will directly impact system stability, product performance, and the quality of service delivered to healthcare professionals and the communities they serve.
Benefits & Perks
Competitive health plan (medical, dental, & vision insurance)
RRSP with company match
Telemedicine service provided by ESO
Front‑loaded vacation and sick time
Employee Assistance Program (EAP)
Peace of mind benefits such as life insurance and disability insurance
Casual office environments
Unlimited office snacks and drinks
About ESO
ESO is a fast‑paced, growing data, technology, and research company passionate about improving community health and safety through the power of data. We pioneer innovative, user‑friendly software to meet the changing needs of…