Principal SRE Engineer Job, Reston area, Virginia, USA, IT/Tech

Overview

At Movius, we close a critical gap companies face in employee-to-client communication over voice and messaging. We are the leading global provider of Secure Communication as a Service (SCaaS™). Our flagship solution, MultiLine™, enhances workflows, resolves compliance gaps, and unifies cross-channel messaging. Movius AI-powered solutions enable businesses to build strong and lasting relationships with their customers in a company-owned, controllable system.

Welcome to Phone 3.0™.

Headquartered in Alpharetta, GA, with offices in Silicon Valley, Bangalore (India), New York, and London, Movius partners with leading global wireless carriers such as T-Mobile, Vodafone, TELUS, BT, Singtel, and more.

Responsibilities

Maintain architecture blueprints, playbooks, and templates for SLOs, postmortems, and change management.
Lead the end-to-end SRE architecture and define technical, reliability, and automation standards.
Drive the SRE roadmap aligned with business SLAs, platform goals, and cloud strategy (AWS preferred).
Serve as reliability authority in design reviews and architecture boards.
Architect and manage full-stack observability (Elastic Stack, OpenTelemetry, Prometheus) with integrated traces, metrics, and logs.
Define and automate SLO/SLI tracking, error budgets, and incident management life cycles.
Build event-driven, self-healing systems and automate infrastructure, deployments, and monitoring.
Optimize distributed systems, Kubernetes workloads, and microservices for scale, performance, and cost.
Lead chaos engineering, root cause analysis, and continuous improvement to reduce MTTR.
Mentor engineers in reliability, automation, and architecture best practices; champion an automation-first culture.

Education & Experience

Bachelor’s or Master’s degree in Computer Science, IT, or equivalent experience.
15+ years in DevOps, Infrastructure, or SRE roles.
4+ years in a senior/principal-level capacity driving SRE strategy and automation.
Proven success designing and scaling large distributed, cloud-native platforms.
Telecom domain experience is a plus.

Technical Expertise

Deep knowledge of AWS (EKS, EC2, RDS, IAM, VPC, Kafka, CloudWatch, API Gateway, Lambda, WAF, KMS).
Helm Chart mastery and container orchestration (EKS).
Hands-on experience with Elastic APM and observability tools.
Expert in Terraform, Jenkins, Bitbucket, and Python/Bash/Go scripting.
Strong grasp of SLO/SLI frameworks, error budgets, and AIOps.
Experience in chaos engineering, performance optimization, and resilience testing.
Excellent documentation and system design communication skills.

Certifications (Preferred)

AWS Certified Solutions Architect – Professional or DevOps Engineer – Professional.
Certified Kubernetes Administrator (CKA) or Application Developer (CKAD).
SRE Foundation, Google SRE, Dynatrace Performance Professional, or Elastic Certified Engineer.
