Principal Software Development Engineer - Observability
Listed on 2026-06-13
Principal Software Engineer, Observability
Our Technology Team partners with teams across Expedia Group to create innovative products, services, and tools that deliver high‑quality experiences for travelers, partners, and our employees. As a Principal Engineer, you will be part of an agile development team with deep expertise in cloud, distributed systems, and observability. You will play a pivotal role in shaping the strategic technical goals for our group, leading the architecture, design, and implementation of a centralized, scalable, and cost‑effective observability platform used by all engineering teams across Expedia.
Responsibilities
- Architect and build core telemetry pipelines for logs, metrics, and traces, evolving the platform to handle a 10x increase in data volume while maintaining performance and cost‑effectiveness.
- Drive OpenTelemetry adoption, spearheading strategy, rollout, and support for the OpenTelemetry Collector across thousands of services.
- Implement platform governance and optimization, designing capabilities for data governance, cost allocation, and resource management, and defining SLOs for the platform.
- Elevate the practice of observability by unifying tooling (Grafana, Datadog, Splunk), documentation, and service lifecycle management within the internal developer portal.
- Automate infrastructure lifecycle with IaC using Terraform and/or Crossplane, eliminating manual toil through automated cluster provisioning and incident remediation workflows.
- Provide technical leadership and mentorship, leading architecture review sessions, authoring RFCs, and mentoring senior engineers on the team.
- Serve as the final escalation point for complex, cross‑cutting production incidents related to the observability platform.
- Collaborate and innovate across a wide variety of technologies and tools, such as Go, Java, Python, AWS, Kubernetes, OpenTelemetry, Prometheus, and more.
Qualifications
- Bachelor’s or Master’s degree in Computer Science or a related technical field, or equivalent practical experience.
- 10+ years of experience in software engineering, focusing on building and operating large‑scale distributed systems, infrastructure automation, or configuration management.
- Deep expertise in observability principles and the three pillars: logs, metrics, and traces.
- Strong hands‑on proficiency with observability technologies such as Prometheus, Grafana, Datadog, Splunk, and OpenTelemetry.
- Proficiency in one or more of Go, Java, or Python.
- Solid understanding of cloud‑native architectures (Kubernetes, Docker, microservices) and major cloud platforms, with AWS preferred.
- Experience designing, building, and operating highly available, scalable, and resilient platforms.
- Excellent hands‑on coding skills with the ability to balance architectural breadth and depth.
- Clear communicator, able to concisely explain complex technical details to diverse audiences.
- Creative problem‑solving using data and insights to support recommendations and influence decisions.
- Experience mentoring senior engineers and establishing standards for operational excellence and code quality at a multi‑project level.
San Jose: $249,000 – $348,500 (potential up to $398,500 based on performance).
Seattle: $231,000 – $323,500 (potential up to $369,500 based on performance).
Expedia Group is committed to creating an inclusive work environment with a diverse workforce. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, veteran status, or any other characteristic protected by law. This employer participates in E‑Verify. The employer will provide the Social Security Administration (SSA) and, if necessary, the Department of Homeland Security (DHS) with information from each new employee's I‑9 to confirm work authorization.