Engineering Manager, AI Inference Systems
Listed on 2026-06-17
-
Software Development
AI Engineer (Applied/Software), Machine Learning/ ML Engineer
About The Team
The Applied AI team safely brings OpenAI's technology to the world. They have released ChatGPT, Plugins, DALL
· E, and the APIs for GPT-4, GPT-3, embeddings, and fine-tuning. They also operate inference infrastructure team seeks to learn from deployment and distribute the benefits of AI, while ensuring that this powerful tool is used responsibly and safely. Safety is paramount, even over unfettered growth. They serve end-users directly through ChatGPT and developers through their APIs, which power product features never before possible.
The Role
Model inference at OpenAI is powered through a single service called "Engine", which wraps the PyTorch transformers for GPT-4 and ChatGPT. OpenAI is looking for an engineering manager to help lead critical work for this service and grow the team.
In This Role, You Will- Own substantial portions of our inference stack
. - Ensure the ability to run GPT-4, ChatGPT, and future models at increasingly high scale with increasing efficiency.
- Hire world‑class AI systems engineers in one of the most competitive hiring markets.
- Coordinate the inference needs of OpenAI's teams and products.
- Create a diverse, equitable, and inclusive culture that makes all feel welcome while enabling radical candor and the challenging of group think.
- Have 3+ years of experience in engineering management and 7+ years as an IC working with high scale distributed systems and ML systems.
- Have experience with ML systems
, particularly high scale distributed inference for modern LLMs. - Have experience with highly available, reliable, production grade systems at scale.
- Have familiarity with the latest AI research and working knowledge of how these systems are efficiently implemented.
- Care deeply about diversity, equity, and inclusion, and have a track record of building inclusive teams.
- Have experience closing extremely competitive candidates for your team, and the ability to craft and convey compelling visions of the future.
- Have a voracious and intrinsic desire to learn and fill in missing skills—and an equally strong talent for sharing learnings clearly and concisely with others.
- Are comfortable with ambiguity and rapidly changing conditions. You view changes as an opportunity to add structure and order when necessary.
As technical context: at the heart of OpenAI's infrastructure is a large-scale deployment of GPU nodes running in dozens of Kubernetes clusters across regions. Some core technologies they build with include Python, PyTorch, CUDA, Triton, Redis, Infiniband, NCCL, NVLink
.
(If this job is in fact in your jurisdiction, then you may be using a Proxy or VPN to access this site, and to progress further, you should change your connectivity to another mobile device or PC).