Lead, Site Reliability Engineer
Listed on 2026-05-16
IT/Tech
Systems Engineer, Cloud Computing, SRE/Site Reliability, IT Support
Journey with us! Combine your career goals and sense of adventure by joining our exciting team of employees. Royal Caribbean Group is pleased to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world.
We are proud to be the vacation‑industry leader with global brands—including Royal Caribbean International, Celebrity Cruises and Silversea Cruises—the most innovative fleet and private destinations, and the best people. Together, we are dedicated to turning the vacation of a lifetime into a lifetime of vacations for our guests.
The Royal Caribbean Group’s Information Technology Team has an exciting career opportunity for a full‑time Lead, Site Reliability Engineer reporting to the Senior Manager, Site Reliability.
Position is onsite and based in Miramar, Florida.
Position is also not eligible for work authorization sponsorship.
Position Summary
The Lead, Site Reliability Engineer (SRE) provides technical and strategic leadership for Royal Caribbean Group’s DevOps and platform engineering ecosystem. This role defines standards, guides platform architecture, and drives enterprise‑wide initiatives across CI/CD, Kubernetes, GitOps, observability, security, and AI‑enabled automation to support reliable, scalable software delivery. The engineer will lead platform design and evolution, drive intelligent automation, and ensure robust integration of DevOps tooling with business processes, fostering operational excellence and innovation.
Essential Duties and Responsibilities
- Owns SRE and DevOps strategy across AWS and Azure, architecting cloud patterns for high availability, disaster recovery, and cost optimization.
- Leads Kubernetes/Helm platform design and evolution (EKS, AKS) supporting production workloads.
- Drives AI‑assisted SRE capabilities by identifying opportunities for intelligent automation, remediation, and operational insights across CI/CD and platform operations.
- Owns the GitHub Actions platform, designing reusable workflows and enforcing fully automated end‑to‑end pipelines.
- Mandates Snyk and SonarQube in all pipelines, enforcing security gates, quality thresholds, and exemption workflows.
- Integrates Terraform IaC execution directly within CI/CD, ensuring infrastructure changes flow through automated controls.
- Owns Backstage lifecycle, including catalog, scaffolder templates, plugin integrations, and adoption governance.
- Builds Software Templates that pre‑wire CI/CD, Terraform modules, and security tooling for new services from day one.
- Owns pipeline‑to‑ServiceNow integration, automating change/release records and gating deployments against approved change windows.
- Leads, mentors, and grows a team of SRE and DevOps engineers, owning technical escalation and platform SLAs/SLOs.
- Drives engineering culture through blameless post‑mortems, runbooks, documentation, and operational excellence.
- Bachelor’s degree in Computer Science, Engineering, or related field required; Master’s degree preferred.
- 7+ years in SRE/DevOps/Platform Engineering, with at least 2 years in a technical lead or staff‑level role.
- Deep expertise in AWS (EKS, EC2, IAM, Lambda, CloudWatch) and Azure (AKS, Entra, Azure Monitor).
- Expert in Terraform (modules, remote state, pipeline‑automated execution, GitOps workflows).
- Advanced proficiency with GitHub Actions (multi‑job workflows, reusable actions, OIDC, secrets management).
- Production Kubernetes experience (cluster lifecycle, Helm authoring, RBAC, network policies).
- Hands‑on experience with Backstage (catalog config, scaffolder templates, plugin integration, governance).
- Demonstrated Snyk and SonarQube pipeline integration with enforced security and quality gates.
- Experience integrating DevOps tooling with ServiceNow change, release, or Digital Release.
- Proven track record of reducing deployment lead time or MTTR, or of improving platform reliability.
- Hospitality, travel, or high‑volume consumer tech experience.
- AWS Solutions Architect Professional; CKA/CKAD certifications a strong plus.
- Experience with GitOps tooling (ArgoCD, Flux) and progressive delivery…