Director, Site Reliability Engineering Job, Fremont area, California, USA, IT/Tech

About Movius

At Movius, we solve a critical gap companies face with employee-to-client communication over voice and messaging. We are the leading global provider of Secure Communication as a Service (SCaaS™). Our flagship solution, Multi Line™, enhances workflows, resolves compliance gaps, and unifies cross‑channel messaging. Movius AI‑powered solutions enable businesses to build strong, lasting customer relationships in a company‑owned, controllable system. Welcome to Phone 3.0™.

Headquartered in Alpharetta, GA, with offices in New York, Silicon Valley, Bangalore, and London, Movius partners with leading carriers like T‑Mobile, Vodafone, TELUS, BT, Singtel, and more. Learn more ius.ai.

Director, Site Reliability Engineering

Role Overview

We are seeking a Director of Site Reliability Engineering (SRE) to lead the reliability, scalability, and operational excellence of our mobile‑first, SIP‑based communications SaaS platform. This platform supports mission‑critical voice, messaging, and unified communications services used by highly regulated global enterprise customers.

The Director of SRE will be responsible for ensuring carrier‑grade reliability, performance, and security of our distributed multi‑cloud infrastructure while building and leading a high‑performing SRE organization. This role partners closely with Engineering, Product, Security, and Customer Experience to deliver resilient, scalable, and observable systems.

The ideal candidate combines deep technical expertise in real‑time communications infrastructure with strong leadership and operational discipline.

Key Responsibilities

Reliability & Platform Operations

Own availability, reliability, and performance of the communications SaaS platform supporting voice, SMS/RCS/MMS, SIP signaling, and mobile services.
Define and manage SLOs, SLIs, and error budgets for mission‑critical services.
Drive operational excellence through incident management, post‑mortems, and continuous improvement.
Ensure 99.99%+ service availability for carrier and enterprise customers.

Communications Infrastructure

Oversee reliability of SIP signaling infrastructure, SBCs, media servers, messaging gateways, and telecom interconnects.
Ensure stability and scaling of real‑time voice and messaging workloads across distributed multi‑cloud environments.
Collaborate with telecom partners and carriers to maintain high service quality and interconnect reliability.

Cloud & Platform Engineering

Lead reliability engineering across multi‑region, multi‑cloud infrastructure (AWS and/or IBM Cloud).
Build highly available architectures with geo‑redundancy, active‑active deployments, and automated failover.
Drive infrastructure‑as‑code, automation, and self‑healing systems.

Observability & Monitoring

Establish best‑in‑class monitoring, alerting, tracing, and observability frameworks.
Implement deep telemetry for call quality, SIP performance, messaging delivery, and system health.
Use data‑driven insights to improve system resilience and operational response.

Incident & Crisis Management

Lead 24/7 operational readiness including on‑call processes and war room coordination.
Define incident severity models, response playbooks, and escalation frameworks.
Conduct blameless post‑incident reviews and drive systemic improvements.

Security & Compliance

Partner with security teams to ensure platform resilience against fraud, abuse, and telecom‑specific threats.
Maintain compliance with telecom and enterprise security standards.

Team Leadership

Build and scale a world‑class SRE organization across multiple regions.
Mentor senior engineers and technical leaders.
Drive a culture of ownership, reliability, and operational excellence.

Cross‑Functional Collaboration

Work closely with software engineering, product, and customer experience teams.
Influence architecture decisions to ensure systems are operable, scalable, and resilient.

Required Qualifications

10+ years of experience in site reliability engineering, cloud infrastructure, or platform operations.
5+ years of leadership experience managing SRE or infrastructure teams.
Strong expertise in real‑time communications systems, including:

SIP signaling
SBCs
Media infrastructure
VoIP…

