Principal Software Development Engineer - Cloud Platform
Listed on 2026-05-20
Software Development
Cloud Engineer - Software, DevOps, Software Engineer
Expedia Group brands power global travel for everyone, everywhere. We design cutting‑edge tech to make travel smoother and more memorable, and we create groundbreaking solutions for our partners. Our diverse, vibrant, and welcoming community is essential in driving our success.
Why Join Us?
To shape the future of travel, people must come first. Guided by our Values and Leadership Agreements, we foster an open culture where everyone belongs, differences are celebrated and we know that when one of us wins, we all win.
We provide a full benefits package, including exciting travel perks, generous time‑off, parental leave, a flexible work model (with some pretty cool offices), and career development resources, all to fuel our employees' passion for travel and ensure a rewarding career journey. We’re building a more open world. Join us.
Principal Software Development Engineer
Our Technology Team partners with teams across Expedia Group to create innovative products, services, and tools to deliver high‑quality experiences for travelers, partners, and our employees. A singular technology platform powered by cloud and data provides secure, differentiated, and personalized experiences that drive loyalty and traveler satisfaction.
We are looking for a Principal Engineer to serve as the technical architect for our Cloud Platform organization, which sits within our Technology division. Reporting to the VP of Cloud Platform, you will be the primary architect of our technical future. The Cloud Platform organization provides the secure, scalable cloud infrastructure, runtime platforms, and developer experience tooling that enable teams across Expedia Group to build, deploy, and operate high‑quality, resilient software quickly and safely.
We are seeing an explosion in code volume and service complexity. The goal for this role is to build a platform that can handle this growth without sacrificing reliability or inflating our cloud bill. You’ll be responsible for making sure our architecture is composable, our developer tools are agentic, our Kubernetes footprint is efficient, and our observability stack provides signals, not just noise.
Responsibilities
- Lead Architectural Evolution: Own the move toward a Cell‑Based Architecture and shift from fragile, monolithic clusters to isolated, predictable failure domains that allow us to scale horizontally with confidence.
- Modernize Kubernetes & Infrastructure: Define our K8s strategy, focusing on multi‑cluster management, service mesh, and automated scaling, ensuring our "Golden Path" makes it easy for engineers to do the right thing by default.
- Harden Reliability & Observability: Set standards for SRE across the org, advancing beyond basic dashboards to causal observability, automated incident response, and rigorous SLO/SLI management.
- Optimize Cloud Economics: Lead our FinOps technical strategy, building tooling and visibility to understand cost‑per‑service and align infrastructure spend with business value.
- Support the Developer Workflow: Build agent‑friendly infrastructure, including standardized Dev Containers and ephemeral environments for fast, isolated iteration without clobbering shared state.
Qualifications
- Extensive professional software development experience designing, building, and operating large‑scale, cloud‑native distributed systems and platform services on Kubernetes.
- Proven ownership of critical services or multi‑service platforms, including responsibility for system design (LLD), API design, data modeling, deployment, and ongoing operational health.
- Deep expertise with at least one major public cloud provider and core platform technologies (compute, networking, storage, service discovery, security, observability, and CI/CD).
- Demonstrated ability to make high‑impact architectural decisions, navigate complex trade‑offs, and guide multiple teams toward coherent, long‑term technical direction.
- Familiarity with AI‑driven systems, tools, or workflows and applying AI/ML concepts to real‑world products within cloud or platform environments.
- Deep knowledge of observability patterns (OpenTelemetry, Prometheus, distributed…