
Manager, Site Reliability Engineering

Job in Oakland, Alameda County, California, 94616, USA
Listing for: Ccgmag
Full Time position
Listed on 2025-12-22
Job specializations:
  • IT/Tech
    Cloud Computing, SRE/Site Reliability, Systems Engineer, IT Project Manager
Salary/Wage Range or Industry Benchmark: 100000 - 125000 USD Yearly
Job Description & How to Apply Below

Position Title

Manager, Site Reliability Engineering

From Fivetran’s founding until now, our mission has remained the same: to make access to data as simple and reliable as electricity. With Fivetran, customer data arrives in their warehouses, canonical and ready to query, with no engineering or maintenance required. We’re proud that more organizations continue to leverage our technology every day to become truly data‑driven.

About the Role

Fivetran is building data pipelines to power the modern data stack for thousands of companies.

As a Manager of Site Reliability Engineering, you will take on responsibility for the Serbia‑based group of SRE engineers. Together with other SRE managers and engineers in Ireland, India, and the US, you will take ownership of the reliability of Fivetran’s service. This includes building and monitoring repeatable infrastructure, ensuring the reliability and robustness of the continuously deployed release pipeline, and providing timely and effective incident response and resolution.

You will co‑own the responsibility for the scalability and reliability of Fivetran’s connector infrastructure on AWS, GCP, and Azure. You will bring together and grow a Serbia‑based team that reliably delivers excellent results while maintaining a culture of strong collaboration, engagement, and continuous improvement.

This is a full‑time position based out of our Novi Sad office. Our hybrid work model offers a blend of remote flexibility and in‑person collaboration, including two days in the office each week to connect and build as a team.

Technologies You’ll Use
  • AWS, Azure, Google Cloud Platform (GCP)
  • EKS, AKS, GKE (managed services)
  • Buildkite, ArgoCD
  • PostgreSQL, Cloud Datastore
  • Go, Java
  • Python, Shell
  • Terraform, Pulumi
  • FastAPI (RESTful APIs)
  • Private Links (AWS, Azure), Private Service Connect (GCP), site‑to‑site VPNs across major cloud providers
  • Grafana
What You’ll Do
Leadership and Talent Management
  • Build, hire, and plan the growth of the Serbia‑based SRE organization
  • Help engineers advance in their careers; actively guide and coach them
  • Set clear expectations and create a positive work environment based on accountability
  • Establish strong global and cross‑team relationships with product, field, software teams, and the other SRE teams around the world
SRE Subject Matter Expertise
  • Drive initiatives that improve service reliability, scalability, and performance through automation, observability, and proactive problem‑solving
  • Advocate for simple, elegant, and easily scalable system design
  • Support new services before they go live through activities such as system design consulting/review, capacity planning, and launch reviews
  • Be hands‑on and willing to act as a player‑coach in SRE areas such as IaC, Observability & Alerting, and Release Management
  • Demonstrate strong accountability for infrastructure cost management
  • Optimize our continuous integration and deployment process, striving for safe, frequent, and automated releases
  • Oversee incident management practices, ensuring timely response, effective/blameless postmortems, and systemic improvements
  • Stay current with emerging technologies, tools, and industry best practices relevant to reliability engineering
Skills We’re Looking For
  • Experience in managing or leading a Site Reliability Engineering (SRE), DevOps, or Infrastructure Engineering team operating in a public cloud at scale
  • Significant working knowledge of Continuous Integration and Deployment processes and tooling
  • Proven experience in cloud‑based infrastructure design and IaC
  • Strong understanding and experience in security control design, implementation, and operations
  • Solid technical working experience on AWS, GCP, or Azure, distributed systems, networking, and container orchestration (Kubernetes)
  • Deep understanding of reliability concepts, including monitoring/observability, capacity planning, and disaster recovery
  • Experience leading incident response, root cause analysis, and reliability‑focused postmortems
  • Familiarity with cost optimization strategies in large‑scale cloud environments
  • Excellent leadership, communication, and stakeholder management skills
  • Ability to iterate in the context of an evolving…