Come Join Our Passionate Team!
Barracuda is a leading cybersecurity company providing complete protection against complex threats. Our platform protects email, data, applications, and networks with innovative solutions, and a managed XDR service, to strengthen cyber resilience. Hundreds of thousands of IT professionals and managed service providers worldwide trust us to protect and support them with solutions that are easy to buy, deploy, and use.
We are committed to a candidate selection process and work environment that is inclusive and barrier free. To ensure candidates are assessed in a fair and equitable manner, accommodations will be provided to prospective employees in accordance with the Accessibility for Ontarians with Disabilities Act (AODA) and the Ontario Human Rights Code.
Envision Yourself at Barracuda:
We are seeking a strategic and visionary Director of Site Reliability Developers (SRD), in the Cloud Operations group, to lead global reliability initiatives across Barracuda’s SaaS portfolio. You will oversee a distributed team of Site Reliability Developers and partner closely with Product Engineering, Security & Compliance, and other Cloud Operations teams to ensure our platforms are highly available, scalable, secure, and cost-efficient. This role will also drive the adoption of AI-powered automation and agentic systems to transform reliability operations.
What will you be working on:
Define and execute Barracuda’s global SRE strategy, aligning reliability goals with business objectives and customer SLAs.
Drive continuous improvement in availability, latency, performance, and cost optimization across all cloud services.
Implement AI-driven observability and anomaly detection for proactive incident prevention; deploy agentic automation systems to manage routine operational tasks, optimize cloud resources, and accelerate remediation workflows; explore LLM-based runbooks and autonomous agents for incident triage and root cause analysis.
Partner with Engineering, Security, and FinOps teams to embed reliability into product design and delivery pipelines.
Influence architectural decisions for reliability, disaster recovery, and observability systems; ensure compliance with security and regulatory standards.
Champion Infrastructure-as-Code and CI/CD automation at scale using Terraform, CloudFormation, GitHub Actions, and Jenkins.
Risk Management:
Facilitate incident response protocols, conduct executive-level postmortems, and implement proactive risk mitigation strategies.
Define and enforce SLIs and SLOs across global services; report reliability metrics to executive leadership.
Build and mentor a high-performing SRE organization; foster a culture of ownership, innovation, and collaboration across regions.
Lead initiatives for cost governance and performance tuning in AWS and Azure environments.
Present reliability roadmaps, KPIs, and risk assessments to senior leadership and stakeholders.
What you bring to the role:
12+ years in infrastructure, cloud operations, or SRE roles, including 5+ years in leadership positions managing distributed teams.
Deep knowledge of AWS and Azure architectures, security, and operations in large-scale SaaS environments.
Experience implementing AI-driven observability, predictive analytics, and autonomous remediation systems.
Proven success implementing Infrastructure-as-Code tools such as Terraform or CloudFormation at enterprise scale.
Advanced experience with GitHub Actions, Jenkins, and deployment strategies (blue/green, canary, rolling).
Expertise in Kubernetes (EKS, AKS) and containerized workloads.
Strong background in Prometheus, Grafana, ELK, and APM tools; experience designing self-healing systems.
Proficiency in Python, Go, or similar languages for automation and tooling.
Skills:
Exceptional ability to lead globally distributed teams, influence cross-functional stakeholders, and drive cultural change.
AWS Solutions Architect/DevOps Professional and Kubernetes certifications (CKA, CKAD) preferred.
What You Will Get from Us:
The anticipated on-target earnings range for this role is $180,000 to $241,00 CAD. Actual compensation offered will depend on the individual's skills, experience, and qualifications.
#LI-hybrid