Senior SRE Engineer - Cloud Operations, Qdrant (Berlin, Germany; IT/Information Technology)

Qdrant is a vector database company on a mission to change how organizations manage and query unstructured data. Our open-source engine and managed cloud solutions power AI-driven search, recommendation, and data discovery. We are a remote-first company, building a global team of passionate engineers to push the boundaries of database infrastructure.

As a Senior DevOps / SRE Engineer on the Cloud Operations team, you will focus on keeping Qdrant Cloud reliable, observable, and secure as usage and infrastructure complexity grow. Your primary responsibility is operational excellence: stability, incident response, and continuous improvement of production systems.

This role is operations-heavy, ideal for engineers who thrive on owning reliability and reducing operational risk at scale.

Tasks

Operate and maintain production cloud infrastructure at scale
Own Kubernetes infrastructure, networking, and deployment pipelines
Improve monitoring, logging, alerting, and operational visibility
Lead incident response, root cause analysis, and follow-up actions
Reduce operational toil through automation and better tooling
Improve reliability, security, and performance of production systems
Collaborate closely with Platform and Regions & Clusters teams
Maintain and evolve runbooks, operational procedures, and alerts
Participate in on-call rotations and continuous reliability improvements

Requirements

Must have

5+ years of experience in DevOps, SRE, or infrastructure operations roles
Strong hands‑on experience operating Kubernetes in production
Solid knowledge of Linux systems, networking, and cloud infrastructure
Experience working with AWS, GCP, or Azure
Strong understanding of monitoring, alerting, and incident management
Experience with infrastructure‑as‑code and automation tooling
Comfortable owning on‑call responsibilities and production incidents
Strong operational mindset and clear communication skills

Nice to have

Experience with Terraform or similar IaC tools
Familiarity with Prometheus, Grafana, Loki, or OpenTelemetry
Exposure to security, compliance, or hardening initiatives
Scripting experience in Python, Bash, or Go
Experience in SaaS, cloud, or data infrastructure environments

Benefits

Competitive salary, equity, and benefits
Fully remote setup with flexible working hours
Clear ownership of reliability and operational excellence
Opportunity to work on mission‑critical customer‑facing infrastructure
Strong collaboration with platform and engineering teams

If you enjoy keeping complex systems reliable and improving operations through automation and discipline, we’d love to hear from you.

Recruiting agencies and headhunters: please contact us only via hirebuffer.com?=qdrant
