Senior Site Reliability Engineer
Overview
What is the Opportunity?
We are on the lookout for a talented Senior Site Reliability Engineer to join our forward-thinking team responsible for the development and enhancement of our CI/CD deployment portal. This platform is designed to facilitate the swift and secure deployment of applications to various cloud environments, supporting all RBC application developers. You will focus on implementing solutions that streamline application delivery, improve operational efficiency, and ensure the platform's reliability and scalability, while leveraging AI-driven tools and methodologies.
Responsibilities
Ensure the performance, quality, and responsiveness of the platform, with a strong emphasis on Site Reliability Engineering (SRE) principles, including robust monitoring, alerting, and incident response practices.
Maintain and enhance the operational capabilities of the platform, ensuring an intuitive user experience while enabling seamless integration with tools and services to proactively identify and resolve potential issues.
Collaborate with cross-functional teams to implement and deliver features for our deployment platform, focusing on automation, scalability, and operational efficiency, with integration of AI-driven solutions.
Implement deployment and management patterns for the various tools on our DevOps platform, optimizing resource allocation and deployment strategies, and using AI to improve processes.
Integrate with cloud services and infrastructure to guarantee secure and efficient application deployment, while exploring opportunities to enhance security and scalability.
Develop and execute automated testing procedures to confirm platform stability and dependability, improving test coverage and identifying edge cases.
Participate in code review processes and contribute to the collective knowledge by documenting technical procedures and methodologies, including those involving AI tools.
Stay informed of emerging development practices and technologies, actively contributing to the ongoing enhancement of our technology stack and platform capabilities.
The role requires providing on-call support.
Must-Have Skills:
3+ years of work experience in Site Reliability Engineering (SRE), applying best practices for running and maintaining critical systems, including monitoring, alerting, and incident management.
Experience with implementing and deploying systems into integrated environments.
Proficient with cloud-based services (e.g., AWS, Azure) and a strong grasp of developing cloud-native applications.
Proficiency with Terraform for Infrastructure as Code (IaC).
Solid understanding of version control systems, particularly Git.
Strong analytical skills, problem-solving abilities, and excellent communication skills.
Nice-to-Have Skills:
Bachelor’s degree in Computer Science, Engineering, or a field relevant to the role.
Experience with full stack development, including experience with frameworks and languages such as JavaScript, React, Node.js, Python, or similar.
Knowledge of Continuous Integration/Continuous Delivery (CI/CD) methodologies and associated tools.
Familiarity with container technologies like Docker and orchestration platforms like Kubernetes.
Experience using AI tools and efficient prompting of LLMs for operational improvements.
Understanding of how AI can enhance CI/CD processes and operational workflows, and experience working with models and MCP servers.
A focus on improving operational efficiency and system reliability, with experience leveraging AI for these purposes.
We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.
A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable.
Leaders who support your development through coaching and managing opportunities.
Ability to make a difference and lasting impact.
Wor…