Lead, Data Center Agentic AI Platform
Listed on 2026-05-29
IT/Tech
Systems Engineer, AI Engineer, Cloud Computing, Data Engineer
Description
AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people who want to help.
You'll join a team of Software Development Engineers building an agentic AI platform that serves a broad customer base of design engineers, quality/reliability engineers, supply chain specialists, field engineers, and other vital roles across AWS data center operations. You'll collaborate with people across AWS to help us deliver the highest standards for quality and reliability while providing seemingly infinite capacity at the lowest possible cost for our customers.
And you'll experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.
As the Software Development Manager for the Data Center Agentic AI Platform team, you will lead a team of Software Development Engineers building the agentic GenAI platform for AWS data centers, which powers AI-assisted operations across the global data center infrastructure. You will own the technical vision and strategic roadmap for the platform, driving investments across agentic AI systems, full-stack engineering, and search and knowledge systems.
Your leadership will shape the direction of a next-generation AI/ML platform that orchestrates physical work processes, automates decision-making, and enhances operational efficiency for a globally distributed base of 30K+ users. You will champion platform thinking by building reusable primitives, APIs, and extensible components that dozens of teams across the Data Center Community build upon.
In this role, you will drive the design and delivery of production-grade agentic systems, including LLM orchestration, tool-calling patterns, agent frameworks, multi-agent orchestration, and intelligent workflow automation. You will partner closely with cross-functional stakeholders, including data center operations, controls engineering, product management, and peer engineering teams, to translate complex operational needs into scalable AI-powered solutions. You will establish and raise the bar on engineering practices, including code reviews, CI/CD, progressive deployment, observability, and operational readiness for AI systems in production.
You will also own hiring strategy and talent development, building a high-performing engineering team with deep expertise in generative AI, distributed systems, and full-stack development, while communicating platform strategy, technical roadmaps, and business impact to senior leadership with clarity and conviction.
Key job responsibilities
- Lead and mentor a team of SDEs building and operating the Data Center Agentic AI Platform, fostering a culture of ownership, innovation, and operational excellence
- Own the end-to-end technical roadmap for the platform, balancing…