Chapter Manager, SRE Development & Reliability
Job in Toronto, Ontario, C6A, Canada
Listed on 2026-06-08
Listing for: Canadian Tire Corporation
Full Time position
Job specializations: IT/Tech (Cloud Computing, IT Support, Systems Engineer, IT Project Manager)
Job Description
Reporting to the AVP, Supply Chain Technology, SRE Operations, the Chapter Manager, SRE Operations & Support, will be responsible for ensuring Supply Chain systems are operational and monitored. This role is also an active participant in all aspects of Production Operational Excellence, including technical vision, telemetry and observation decisions, automation strategy, framework development, solution delivery, incident and problem management.
What you’ll do:
- Collaborate with technology leaders and stakeholders to define the SRE strategy and best practices for ensuring the reliability, scalability, and performance of critical systems and services.
- Oversee the incident management and response process within the chapter.
- Establish and enforce monitoring and alerting best practices (identifying, configuring and tuning of events, logs, metrics and traces) to proactively identify and resolve potential issues before they impact users.
- Collaborate with product teams to define appropriate SLOs and SLIs for their services.
- Encourage the development and adoption of automation and tooling to streamline SRE processes, including incident response, system provisioning, monitoring and alerting, configuration management and knowledge management.
- Analyze new services (in the production or design stage) to ensure they align with industry best practices and the CTC monitoring framework.
- Track and monitor the performance and progress of SRE-related initiatives. Maintain dashboards to measure, optimize and report on application service performance and availability.
- Lead regular operational reviews covering performance trends, anomalies, errors, and other availability events with SREs, product owners, and development teams.
- Work with chapter members to establish and manage on‑call rotations effectively.
- Manage the Problem Management process. Review root cause analysis reports and foster a culture of solving issues where they originate.
- Collaborate with admins and L3 developers to prioritize problem root cause analysis and fixes based on incident and error analysis.
- Maintain an inventory of all applications and services; coordinate with Chapter Managers from Platform teams to keep Supply Chain Service Offerings and the CMDB in ServiceNow up to date.
- Track and ensure infrastructure and application patches are applied on time.
- Manage remediation of security vulnerabilities identified during audit scans in the production environment.
- Drive the maturity of the chapter by promoting and implementing SRE efficiency and maturity improvement initiatives.
- Support delivery leaders in building and maturing the SRE practice.
What you bring:
- Experience in Incident Management and Problem Management.
- Experience creating, collecting, tuning and responding to monitoring: alerts, events, metrics, tracing and dashboarding.
- Experience using APM including New Relic.
- Experience in dashboard development in ServiceNow and Power BI.
- Systems engineering basics including networking, DNS, virtualization, containers, and various OS (Linux, AIX, Windows).
- Experience presenting to executive stakeholders.
- Strong technical and analytical skills in troubleshooting and correlating information. Previous developer or system/application administrator experience.
- SRE experience creating and designing meaningful SLO, SLI, SLA and error budget definitions.
- Experience with monitoring, logging and telemetry tools like New Relic, Sumo Logic, Grafana, Splunk, Azure Monitor or similar.
- Ability to identify toil and remove redundant tasks leveraging scripting and automation.
- Excellent ability to liaise with business users, IT personnel, and vendors to gather requirements and deliver solutions.
- Knowledge and experience in the Supply Chain Industry.
- Understanding of data and ability to link trends with outcomes.
- Willingness and ability to work during non-standard hours, including nights, weekends, and holidays, to support 24/7 operational needs.
- Familiarity with cloud platforms is an asset.
- Experience with Jira, Confluence, and ServiceNow is an asset.
- Familiarity with microservices architecture and system integrations is an asset.
- Familiarity with DevOps practices is an asset.
- Knowledge of Retail and Supply Chain Business is an asset.
- Good understanding of SAFe methodology is an asset.
Broadband Salary Range: $79,000 – $131,000. Typical hiring range: $79,000 – $110,000.
Benefits:
- Comprehensive benefits and retirement programs
- Performance incentives and continuing education programs
- Mental health coverage of $5,000 per year for eligible employees and families
- Career growth opportunities and product discounts
- Canadian Tire Profit Sharing