Engineering Manager, Site Reliability

Job in Toronto, Ontario, C6A, Canada
Listing for: Relayfi
Full Time position
Listed on 2026-02-04
Job specializations:
  • IT/Tech
    Systems Engineer, SRE/Site Reliability, IT Support, Cloud Computing
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 CAD yearly
Job Description & How to Apply Below

As Relay continues to scale, the reliability, performance, and resilience of our platform are no longer just technical concerns. They are core to our customer experience and business success.

Relay is building a platform that SMBs can rely on every day, and our Site Reliability Engineering team is at the core of making that possible. We’re looking for an Engineering Manager to lead our SRE team, responsible for the scalability, reliability, and robustness of Relay’s systems. This role is about more than running infrastructure or responding to incidents. It’s a leadership role at the intersection of technology, people, and the business.

You’ll guide a highly capable SRE team, help shape how reliability shows up across the organization, and ensure our systems scale safely as customer impact and complexity increase.

If you’re excited by technically demanding environments and energized by building strong teams, healthy culture, and effective cross-functional partnerships, this role is built for you.

What You'll Be Doing
  • Lead, coach, and grow a high-performing Site Reliability Engineering team; support career development, technical excellence, and ownership

  • Own the reliability, scalability, and performance of Relay’s platform, ensuring our systems are resilient as the business grows

  • Lead and evolve SRE best practices, including incident management, on-call operations, SLIs/SLOs, and error budgets

  • Partner closely with Engineering and Data teams to ensure reliability and scalability are built into every feature we ship

  • Drive continuous improvement through post-incident reviews, root cause analysis, and preventative action

  • Guide infrastructure and platform investments to support long-term scalability, security, and operational efficiency

  • Define and track key reliability KPIs (e.g., uptime, latency, incident frequency, MTTR) and use data to inform priorities and decisions

  • Champion a culture of learning, operational excellence, and “running towards problems” across the engineering organization

Who You Are
  • You have 3+ years of experience managing or leading engineers and 6+ years of experience in Site Reliability Engineering, Platform Engineering, or Infrastructure roles

  • You have a strong track record of owning and improving system reliability, scalability, and performance in production environments

  • You’re experienced in improving observability, performance, or operational maturity at growing companies

  • You’ve led teams through incident response, postmortems, and reliability improvements, using data and clear accountability to drive better outcomes

  • You have a strong foundation in operating and scaling production systems in cloud environments (e.g., AWS) and in modern infrastructure practices (IaC, CI/CD, monitoring, alerting)

  • You have a proven record of partnering with Product and Engineering leaders to balance delivery velocity with long-term reliability and operational excellence

  • You’re a leader who knows how to coach, motivate, and grow engineers while setting a high bar for ownership, quality, and technical excellence

  • You’re highly collaborative and experienced in leading cross-functional initiatives that span engineering, product, and operations

  • You thrive in fast-paced, ambiguous environments and are comfortable leading through change as the platform and organization scale

Bonus Points
  • You’ve built or scaled an SRE or platform team at a growing fintech startup

  • You have experience supporting high-availability, customer-facing systems in fintech or other regulated environments

  • You’re familiar with SRE best practices such as SLOs, error budgets, and capacity planning at scale

The Interview Process:

  • Stage 1: A 1-hour Google Meet call with a member of the People team

  • Stage 2: A 1-hour Google Meet video call with the hiring manager (VP of Engineering)

  • Stage 3: A 1-hour in-person interview with a member of Relay’s senior leadership team

  • Stage 4: A 1-hour live technical assessment via Google Meet, or in person, with a panel of interviewers (technical leaders & stakeholders)

Our Commitment To You:

  • Competitive salary and meaningful equity: Relay employees are Relay owners, complete with equity and a competitive salary.

  • Comprehensive health benefits:
