Senior Site Reliability Engineer at Confluent
Job in
Toronto, Ontario, C6A, Canada
Listed on 2026-06-22
Listing for:
IBM
Full Time position
Job specializations:
- IT/Tech
SRE/Site Reliability, Cloud Computing: Infrastructure & Operations, Systems Engineer
Job Description & How to Apply Below
This senior role allocates 75% of your time to engineering tasks, tool enhancement, and failure-pattern analysis, and 25% to coaching teams and promoting incident response practices. Your contributions are essential to reducing incidents in Confluent's dynamic cloud environment.
Key Responsibilities:
• Analyze failure patterns for proactive reliability design
• Manage configuration of Rootly and key integrations
• Define and uphold SLO/SLA frameworks
• Edit incident documents for customer-facing quality
• Create training programs and guide teams through post-mortems
Requirements:
• 10+ years in SRE, incident management, or reliability engineering
• Proficiency with cloud platforms: AWS, GCP, or Azure
• Expertise in incident management tools like Rootly
• Strong knowledge of distributed systems
• Advanced experience with large-scale reliability programs
Utilize your skills to enhance Confluent's reliability across multi-cloud systems.
Position Requirements
10+ years of work experience