Site Reliability Engineer
Listed on 2026-02-16
IT/Tech
Systems Engineer, SRE/Site Reliability
We are supporting a German technology company operating mission-critical, cloud-native platforms that serve internal engineering teams and external customers at scale.
The company treats reliability as a product feature, not an afterthought, and is now strengthening its platform team with a Senior Site Reliability Engineer who will take real ownership of stability, observability, and performance.
This is a fully remote role within Germany and requires professional German language skills.
As a Senior SRE, you will apply software engineering principles to infrastructure and operations challenges.
You will work closely with platform and development teams to ensure systems are:
- Highly available
- Automated by default
This is not a ticket-based operations role. You will influence how reliability is designed, measured, and improved across the platform.
Key Responsibilities
- Own and improve system reliability, uptime, and performance
- Design and operate observability stacks (metrics, logs, traces)
- Define and implement SLIs, SLOs, and error budgets
- Conduct load testing, performance tuning, and capacity planning
- Reduce operational toil through automation and tooling
- Lead or contribute to incident response and post-incident reviews
- Collaborate closely with engineers to embed reliability-by-design
Tech Stack
- Docker & Helm for packaging and deployment
- Python or TypeScript for automation and tooling
- Modern monitoring and observability platforms
- Cloud-native and container-first architecture
Requirements
- Strong experience as a Site Reliability Engineer, Platform Engineer, or senior DevOps engineer
- Hands-on production experience with Kubernetes
- Solid understanding of observability and incident management
- Automation mindset and comfort writing production-quality code
- Calm, methodical approach to problem-solving in live environments
- German language proficiency (spoken and written)
What We Offer
- Fully remote (Germany)
- High ownership and technical influence
- Clear commitment to SRE best practices
- Engineering-driven culture with minimal bureaucracy
- Real-world scale and meaningful reliability challenges