Senior Site Reliability Engineer - Observability

Job in Poland
Listing for: Moderna
Full Time position
Listed on 2025-12-07
Job specializations:
  • IT/Tech
    Cloud Computing, Systems Engineer
Job Description & How to Apply Below
Location: Poland

Overview

The Role: Joining Moderna offers the unique opportunity to be part of a pioneering team that's revolutionizing medicine through mRNA technology, with a diverse pipeline of development programs across various diseases. As an employee, you'll be part of a continually growing organization, working alongside exceptional colleagues and strategic partners worldwide, contributing to global health initiatives. Moderna's commitment to advancing the technological frontier of mRNA medicines ensures a challenging and rewarding career experience, with the potential to make a significant impact on patients' lives worldwide.

Moderna is solidifying its presence within our international business services hub in Warsaw, Poland, a city renowned for its rich scientific and technological heritage. This hub provides critical functions, meeting the growing demand of Moderna’s global business operations. We’re inviting professionals from around the world to join our mission and contribute to the future of mRNA medicines.

We’re seeking a Senior Site Reliability Engineer – Observability with deep expertise in designing, building, and operating observability solutions across application, database, host, and container environments. In this role, you will lead the development of a modern, open-source observability platform – leveraging technologies such as Grafana, Prometheus, or similar – that is scalable, resilient, and cost-effective. This platform will form the foundation for enterprise-wide monitoring and log management, empowering teams to gain actionable insights, optimize performance, and improve system reliability.

This is a high-impact role for a self-starter who takes initiative and drives outcomes, with ownership spanning observability platforms, governance, agent fleet management, automation, and FinOps practices – shaping how Moderna advances its observability strategy in a rapidly growing global enterprise.

Responsibilities

Your Key Responsibilities Will Be

Platform Ownership & Operations
  • Manage and advance Moderna’s enterprise observability platform with a focus on open-source and SaaS observability technologies (Grafana, Prometheus, Loki, Tempo, Jaeger, OpenTelemetry, Dynatrace, Splunk, etc.).

  • Lead governance, agent fleet management, and FinOps optimization to ensure the platform is scalable, cost-effective, and compliant with enterprise requirements.

  • Balance hands-on engineering work (building, configuring, and operating the platform) with strategic ownership (roadmap influence, governance, cost optimization).

  • Collaborate with vendors and open-source communities to influence feature roadmaps and maximize platform value.

Observability Engineering
  • Design and build highly scalable, resilient, and cost-optimized observability architectures to support application, database, host, and container monitoring.

  • Implement telemetry pipelines for metrics, traces, and logs using Grafana, Prometheus exporters (e.g., Node, Blackbox), Kubernetes instrumentation, distributed tracing, or similar technologies.

  • Establish and evolve best practices for monitoring, alerting, SLOs/SLIs, and incident detection across hybrid environments (cloud-native and on-prem).

  • Partner with application and infrastructure teams to enable self-service observability capabilities, accelerating troubleshooting and reliability improvements.

Log Management
  • Build and maintain enterprise-scale log management capabilities within the observability platform.

  • Evolve log management to serve as a scalable, cost-effective alternative to traditional log aggregation solutions.

  • Partner with security and infrastructure teams to ensure logging meets performance, compliance, and retention requirements.

Incident Response & Collaboration
  • Integrate observability solutions with incident management platforms such as PagerDuty to streamline escalation, response, and workflow automation.

  • Oversee and optimize on-call processes, ensuring alerts are actionable, routed effectively, and resolved quickly.

  • Provide real-time telemetry during incidents and support root cause analysis (RCA) backed by observability data.

Automation & Integration
  • Develop automation using Python, Terraform,…

Position Requirements
10+ years of work experience