
Chapter Manager, SRE Development & Reliability

Job in Toronto, Ontario, M5A, Canada
Listing for: Canadian Tire
Full Time position
Listed on 2026-06-08
Job specializations:
  • IT/Tech
    Cloud Computing, IT Project Manager, IT Support, Systems Engineer
Job Description & How to Apply Below
Reporting to the AVP, Supply Chain Technology, SRE Operations, the Chapter Manager, SRE Operations & Support will be responsible for ensuring that Supply Chain systems are operational and monitored.
This role is also an active participant in all aspects of Production Operational Excellence, including technical vision, telemetry and observation decisions, automation strategy, framework development, solution delivery, incident and problem management.
What you'll do:
Collaborate with technology leaders and stakeholders to define the SRE strategy and best practices for ensuring the reliability, scalability, and performance of critical systems and services.
Oversee the incident management and response process within the chapter
Establish and enforce monitoring and alerting best practices (identifying, configuring, and tuning events, logs, metrics, and traces) to proactively identify and resolve potential issues before they impact users.
Collaborate with product teams to define appropriate SLOs and SLIs for their services
Encourage the development and adoption of automation and tooling to streamline SRE processes, including incident response, system provisioning, monitoring and alerting, configuration management, and knowledge management.
Analyze new services (in the production or design stage) to align them with industry best practices and the CTC monitoring framework.
Track and monitor the performance and progress of SRE-related initiatives. Maintain dashboards to measure, optimize, and report on application service performance and availability.
Ensure it maintains functionality, programmability, & observability
Lead regular operational reviews covering performance trends, anomalies, errors, and other availability events with SREs, product owners, and development teams
Work with chapter members to establish and manage on-call rotations effectively
Manage the Problem Management process.
Review root cause analysis reports published as part of the Incident and Problem Management process, and foster a culture of solving issues where they originate, to avoid redundancy.
Collaborate with admins and L3 developers to prioritize problem root cause analysis and fixes based on incident and error analysis to achieve highly reliable infrastructure, systems, and integrations.
Maintain an inventory of all the applications and services provided by the Platform teams.
Coordinate with Chapter Managers from Platform teams to keep all Supply Chain Service Offerings and the CMDB in ServiceNow up to date.
Track and ensure infrastructure and application patches are applied on time.
Manage remediation of security vulnerabilities identified during audit scans in our production environment.
Drive the chapter's maturity by promoting and implementing SRE efficiency and maturity improvement initiatives.
Support delivery leaders in building & maturing the SRE practice

What you bring:

Experience in Incident Management and Problem Management
Experience creating, collecting, tuning & responding to all things monitoring: alerts, events, metrics, tracing & dashboarding
Experience using APM including New Relic
Experience in dashboard development in ServiceNow and Power BI
Systems engineering basics including networking, DNS, virtualization, containers, & various OS (Linux, AIX, Windows)
Experience presenting to executive stakeholders
Strong technical & analytical skills in troubleshooting and correlating information.
Previous developer or system/application administrator experience
SRE experience creating and designing meaningful SLO/SLI/SLA and error budget definitions

Experience with monitoring, logging & telemetry tools like New Relic, Sumo Logic, Grafana, Splunk, Azure Monitor or similar
Ability to identify toil and remove redundant tasks leveraging scripting and automation
Excellent ability to liaise with business users, IT personnel, and vendors gathering requirements and delivering solutions
Knowledge and experience in the Supply Chain Industry
Understanding of data and ability to link trends with outcomes.
Willingness and ability to work during non-standard hours, including nights, weekends, and holidays, as necessary to support 24/7…