Sr. Director - Backend Engineering
Listed on 2026-01-05
IT/Tech
AI Engineer, Cloud Computing
Sr. Director Back-end Engineering – AI Infrastructure Orchestration
Company Introduction
We exist to wow our customers. We know we’re doing the right thing when we hear our customers say, “How did we ever live without Coupang?” Born out of an obsession to make shopping, eating, and living easier than ever, we’re collectively disrupting the multi-billion-dollar e-commerce industry from the ground up. We are one of the fastest-growing e-commerce companies and have established an unparalleled reputation as a dominant and reliable force in South Korean commerce.
We are proud to have the best of both worlds: a startup culture with the resources of a large global public company. This fuels us to continue growing and launching new services at the pace we have kept since our inception. We are all entrepreneurs surrounded by opportunities to drive new initiatives and innovations. At our core, we are bold and ambitious people who like to get our hands dirty and make a hands‑on impact.
At Coupang, you will see yourself, your colleagues, your team, and the company grow every day.
Our mission to build the future of commerce is real. We push the boundaries of what’s possible to solve problems and break traditional tradeoffs. Join Coupang now to create an epic experience in this always‑on, high‑tech, and hyper‑connected world.
Role Overview
Strategy and Leadership
- Define and execute the long‑term vision and roadmap for the company’s AI infrastructure orchestration layer, aligning it with overall business and AI Services goals.
- Lead, mentor, and grow a high‑performing engineering and operations team focused on AI infrastructure and platform engineering.
- Manage budget and resource allocation for AI infrastructure deliverables.
- Act as a key liaison between AI Infra and other service owners and consumers, core engineering, Cloud infrastructure, and executive leadership.
- Oversee the design, implementation, and maintenance of the core orchestration platforms for large‑scale AI model training (e.g., distributed training, hyperparameter tuning) and deployment (e.g., containerization, serverless functions, edge deployment).
- Ensure reliability, security, and compliance of the AI infrastructure, meeting strict standards for data governance and model integrity.
- Establish Service Level Objectives (SLOs) and Key Performance Indicators (KPIs) for the AI platform services and lead efforts for continuous optimization and performance tuning.
Success Metrics
A successful Senior Director - AI Infrastructure Orchestration will be measured by:
- The time‑to‑market to build, scale, and operate AI infrastructure.
- The resource utilization rate and cost efficiency of the AI compute infrastructure.
- The reliability and uptime of the core AI platform services.
- The talent retention and development within the AI Infrastructure team.
- Select, evaluate, and integrate the core technologies required for the AI stack (e.g., Kubernetes, Kubeflow, Ray, ML frameworks, GPU/accelerator management, distributed file systems).
- Champion infrastructure‑as‑code (IaC) principles to manage and provision AI resources consistently and at scale.
Education: Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
Experience:
- 15+ years of progressive experience in software engineering, infrastructure, or platform operations.
- 5+ years of experience leading and managing technical teams, ideally at the Director or Sr. Director level or in an equivalent capacity.
- Deep, hands‑on experience designing and operating large‑scale distributed systems and cloud‑native architectures.
- Proven experience specifically with AI infrastructure orchestration (e.g., using Kubernetes, Kubeflow, or similar MLOps tools) and managing accelerated compute resources (GPUs, TPUs, etc.).
- 15+ years of cloud backend engineering, cloud design, deployment, and DevOps.
- 15+ years of experience leading system design and architecture leveraging private clouds and AWS and/or Azure/GCP.
- 10+ years of demonstrable experience building and operating infrastructure as code and infrastructure automation; comfortable with many flavors of Linux.
- 15+ years of…