Site Reliability Engineer (SRE)

Job in Vancouver, BC, Canada
Listing for: The Walt Disney Company
Full Time position
Listed on 2025-12-02
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: CAD 124,200 - 166,700 per year
Job Description & How to Apply Below
Position: Staff Site Reliability Engineer (Staff SRE)

Location: Vancouver, Canada
Business: Walt Disney Animation Studios
Date posted: Oct. 31, 2025

Job Summary:

Walt Disney Animation Studios’ world‑class filmmakers, artists, and technical collaborators create the magic of animation. Bring your unique talents, passion and ideas to our team and prepare to play in a creative, artist‑friendly environment.

We are seeking a Staff SRE with expertise in Linux platform systems administration, software development (e.g. Python, Go, Java, Node), CI pipeline tools (e.g. Jenkins), Git source management, cloud hosting (AWS, GCP & Azure), container computing (e.g. Docker, OCI), and web technologies. The ideal candidate will enjoy the diversity and challenges of working at various levels in the foundational deployment stack, from defining configuration management to developing CI/CD infrastructure and processes.

This role resides within the Platform and Infrastructure team at Walt Disney Animation Studios (WDAS), where we build the tools and manage the infrastructure that artists use daily to create our celebrated animated content. The SRE team within Platform Engineering is focused on optimizing service deployments and improving the availability, latency, performance, efficiency, and observability of systems. Our projects have in common the pursuit of simple and performant solutions to complex problems using Agile and DevOps methodologies as part of high‑energy, proficient teams.

Critical to success in this role is an aptitude for working collaboratively with a technical team. You will help to develop and drive requirements and strategies while also supporting services and core services infrastructure.

Our studio thrives on a wide variety of technical backgrounds and experiences, so we encourage candidates to apply even if their experience is not specified below. Bring your unique talents, passion and ideas to our team, and be a part of Disney’s creative legacy!

Responsibilities

As Staff SRE, you will translate ideas into tangible products that shape experiences by focusing on a systematic approach to automation, resiliency, efficiency, stability, security, performance, and capacity management, as well as documentation. You will serve as a subject matter expert in multiple areas and be looked to by your fellow team members as a "go‑to" individual; you are someone who has a clear understanding of SRE principles and best practices and can thoroughly explain them to a given audience.

To be successful in this role you will continuously uphold and improve all the relevant reliability aspects for our services, with an increased focus on SLIs and SLOs, while raising the reliability of a variety of large‑scale user‑facing and internal services. As Staff SRE, you will maintain a strong understanding of stakeholder workflows and requirements, and then be able to translate the targeted solutions into an end‑to‑end architectural design.

You will work with engineering, creative and production teams in an extremely collaborative and high‑energy environment to brainstorm, architect, gather requirements, troubleshoot, and provide stellar customer support. You are passionate about constantly learning, applying technology to solve complex problems, and are a highly motivated, optimistic, proactive, creative thought leader and project manager.

Additional Responsibilities Include:
  • Support a wide range of on‑premises and cloud deployments using infrastructure‑as‑code, self‑healing, and security automation patterns, and help others adopt the infrastructure‑as‑code paradigm
  • Deploy and manage a wide array of on‑premises and cloud deployments
  • Develop useful telemetry, alerts, and response to reduce Mean Time To Repair (MTTR)
  • Collaborate and provide technical excellence within and across teams
  • Consult on best practices and develop tools to enable smooth adoption of good service reliability practices and methods
  • Identify areas of improvement in reliability, efficiency, and operations
  • Build tools to help your SRE team quickly pinpoint, isolate and resolve issues related to infrastructure, platform services and applications
  • Continuously refine…