Principal Engineer - Site Reliability
Listed on 2025-11-29
IT/Tech
Systems Engineer, Cloud Computing
Our Purpose
At Xero, we’re here to help you supercharge your business. We do this by automating routine tasks, surfacing actionable insights and connecting businesses with the right data, advisors and apps. When that happens, we’re not only making life better for small business, we’ll be building a stronger economy that can change the world.
About the role
As a Principal Engineer within the Site Reliability Engineering team at Xero, you will be a figurehead within your portfolio, providing leadership in how SRE influences teams across Xero to provide the most reliable experience for our customers. Leaders, engineers and teams you work with will aspire to emulate you, and the people around you will grow in capability and confidence by working with you!
You’ll be a strong communicator with the ability to influence others in a human way. You will take ownership of your portfolio and guide your team to greatness. You'll come with strong business acumen and stakeholder management capabilities, and the ability to solve cross-organisation engineering challenges by using influence rather than authority to enact change.
We're looking for an expert and evangelist in modern SRE principles as a fundamental requirement for this role; in addition we value people with a broad set of skills, who can share their wealth of knowledge with others to drive change and growth at Xero.
About the team
In Site Reliability Engineering (SRE), we drive and influence Xero to provide the most reliable experience for our customers. We are a global team based across New Zealand, Australia and the USA.
In SRE at Xero, we combine software and systems engineering to enable engineers across Xero to build and support products that are observable, stable, performant, tolerant to failure, and operate as intended in the face of varying conditions.
We strive to maximise the impact of post incident learning across the organisation to improve the reliability and robustness of the Xero platform, while providing enablement and training across observability, reliability engineering, incident management and service ownership.
We also enable engineers across Xero by developing, supporting and integrating a collection of proprietary and off-the-shelf tooling for incident management and response, incident analysis and learning, monitoring and observability, and resource ownership. We surface data and metrics, and provide detailed insights across operational health, production operations and developer productivity.
You'll come with a wide range of skills, including exposure to:
- Reliability and distributed systems engineering, including running complex systems at large scale
- Strong strategic delivery experience in either software engineering or platform engineering
- Experience working in environments with more advanced security and networks
- Applying systems thinking and systems engineering to the engineering environment
- Experience applying AI/ML and advanced analytical techniques to drive operational improvements and predictive reliability
- Advanced experience in logging, monitoring and observability of distributed systems, including troubleshooting and service level objectives
- Leading incident management and response, including complex and high severity incidents; post incident reviews, incident analysis and learning from incidents
- Strong hands-on experience in DevOps, continuous delivery, CI/CD, automated quality and safe deploy and release at scale
- Experience with the implementation and support of SaaS developer tooling commonly used for observability and incident management, such as New Relic, Sumo Logic and/or PagerDuty
- Taking a multi-year, industry leading perspective, you will ensure that our products are observable, stable, performant, tolerant to failure, and operate as intended in the face of varying conditions
- Building deep cross-functional relationships at all levels of the organisation, breaking down silos to influence for the best reliability outcomes for Xero
- Through curiosity and thoughtful questioning, you will engage in productive challenge, manage different viewpoints and move critical company priorities…