Senior Site Reliability Engineer, Munich, Bavaria, Germany, IT/Information Technology, Stott and May

Senior Site Reliability Engineer (m/f/d)

Remote or Munich (Hybrid)

About the Company

We are a technology company providing large-scale digital media streaming services across multiple platforms and devices. We operate the full content delivery pipeline end-to-end, including live streaming, on-demand video, recording services, and multi-device playback.

Our team runs a highly available, low-latency streaming platform serving millions of users across the country. We design, operate, and continuously optimize the systems responsible for transporting and delivering media efficiently and reliably.

About the Role

We are looking for an experienced Site Reliability Engineer to help us enhance and scale our core service infrastructure. As part of the SRE team, you will work on building resilient systems, improving performance and observability, and ensuring smooth operation of our highly available platform.

You will play a key role in designing reliable services, automating operational processes, and maintaining critical components of a large-scale streaming environment.

Your Responsibilities

Design, improve, and maintain systems to increase stability, scalability, and availability and to reduce latency
Work collaboratively to troubleshoot and solve issues in highly available production environments
Own the architecture and reliability of our central Kubernetes platform
Monitor system health and participate in the on‑call rotation to manage incidents
Enable product teams to build microservices using CNCF tools (e.g., Kubernetes, Prometheus, OpenTelemetry)
Develop automation and tooling to prevent incidents and streamline operational workflows

Your Profile

Strong experience with containers and managing Kubernetes clusters
Hands‑on experience with Terraform and infrastructure automation
Ability to design and implement APIs (REST or gRPC)
Proficiency in a backend programming language (ideally Go)
Experience with cloud providers such as AWS or GCP
Proven track record operating large‑scale, distributed systems
Solid understanding of Linux, networking fundamentals, and system‑level debugging
Fluency in German or English
Willingness to travel occasionally for in‑person meetings

What We Offer

Work remotely or from our office in Germany
Contribute to a high‑impact product used by millions
Modern and scalable technology stack
A genuinely agile environment with short iteration cycles
Real ownership and autonomy within a strong engineering culture
Opportunities to learn and grow alongside experienced teammates

Benefits

Remote‑first setup and flexible working hours
Regular company events and team gatherings
Personal learning and development budget
Comprehensive benefits package
30 days of vacation per year
