Manager Technology SRE
Listed on 2026-06-04
IT/Tech
Cloud Computing, IT Project Manager, SRE/Site Reliability, Systems Engineer
Pay Range: $100,000 - $130,000
At The Home Depot Canada, we want you to feel valued and supported. The pay range you see represents base salary only. In addition, your total rewards may include: semi-annual bonuses tied to business performance; a Deferred Profit-Sharing Program to assist with retirement savings; comprehensive paid benefits; a 15% discount on Home Depot stock purchases; and merit-based salary increases. We are committed to recognizing your efforts and supporting your growth with us.
Position:
Manager, Technology Site Reliability Engineering (SRE) - eCommerce
Position Overview:
The Manager, SRE will lead a team of Site Reliability Engineers to ensure the reliability, performance, and operational support of our eCommerce systems, with a focus on Google Cloud Platform (GCP) environments. This role requires a strong background in reliability reviews, performance engineering practices, production engineering, and operational support, with an emphasis on DevOps principles and GCP expertise.
Responsibilities:
- Leadership & Management:
- Lead and mentor a team of Site Reliability Engineers
- Foster a culture of continuous improvement and innovation
- Collaborate with cross-functional teams to align SRE practices with business objectives
- Reliability & Performance:
- Conduct reliability reviews to identify areas for improvement and implement solutions to enhance system reliability, particularly in GCP environments
- Implement and promote performance engineering practices to ensure optimal system performance on GCP
- Develop and maintain service level objectives (SLOs) and error budgets
- Production Engineering & Operational Support:
- Oversee production engineering efforts to ensure systems are designed for operational excellence and reliability, leveraging GCP services and best practices
- Manage incident response and post-incident reviews to minimize downtime and improve system resilience
- Implement monitoring, alerting, and observability solutions to proactively identify and address issues
- Develop and maintain runbooks and playbooks for common operational tasks
- Coordinate with security teams to ensure compliance with security policies and best practices
- DevOps & Continuous Improvement:
- Drive DevOps initiatives to improve collaboration between development and operations teams, with a focus on GCP-native tools and services
- Implement and maintain CI/CD pipelines to streamline deployment processes in GCP environments
- Identify and implement automation opportunities to reduce manual tasks and improve efficiency
- Promote the use of Infrastructure as Code (IaC) to manage and provision cloud resources
- Continuously evaluate and integrate new tools and technologies to enhance DevOps practices
- Release Management:
- Implement and maintain release management best practices to minimize disruptions and maximize system stability
- Collaborate with DevOps teams to integrate release management into CI/CD pipelines
- Oversee release schedules, ensuring minimal impact on business operations
- Ensure there is a rigorous release readiness process in place that includes reviews and post-release retrospectives
- Maintain a release calendar and communicate release plans to stakeholders
- Strategic Planning:
- Create and maintain a strategic roadmap for SRE initiatives, aligning with business goals and technological advancements
- Refine and standardize Standard Operating Procedures (SOPs) to enhance operational efficiency and consistency
- Address customer pain points by developing and implementing solutions that improve user experience and system reliability
- Engage with stakeholders to understand their needs and incorporate feedback into strategic planning and execution
- Monitor industry trends and best practices to ensure the SRE team remains at the forefront of technology
Experience:
- Bachelor’s degree in Computer Science, Engineering, or a related field
- Strong problem-solving and analytical abilities
- Excellent communication and collaboration skills
- 4-6 years of relevant work experience, including significant experience with GCP
- Extensive experience with cloud infrastructure, GCP services and architecture
- Proven track record of managing and optimizing large-scale systems on GCP
- Proven ability to effectively communicate with individuals at all levels of the organization
- Ability to maintain relationships and negotiate with vendors
- Ability to operate in and leverage resources in a matrixed environment
- Ability to analyze and present data to support ideas
- Ability to clearly communicate with all levels of the organization