GPU Stack Build Architect
Listed on 2026-05-15
Software Development
AI Engineer, DevOps, Software Engineer, Cloud Engineer - Software
WHAT YOU DO AT AMD CHANGES EVERYTHING
At AMD, our mission is to build great products that accelerate next‑generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture.
We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.
AMD’s AI software stack is moving fast — and keeping pace means shipping complete, validated GPU stack releases to customers as quickly as the software can evolve. Today, that release velocity is limited by the coordination overhead between the layers of the stack: firmware, kernel driver, and ROCm each have their own build systems, their own workflows, and no shared baseline.
Every release requires manual effort to assemble and validate a coherent recipe across all three.
- Make foundational build system architecture decisions — the super-build is greenfield. You’ll determine how 65+ firmware components, the kernel driver, and ROCm are structured, how dependencies are expressed and resolved, and how the system scales as more components onboard. These decisions matter and they're yours to make.
- Lead firmware recipe migration using AI‑assisted workflows — the existing firmware builds are scattered across multiple CI systems with no single source of truth. You’ll reverse‑engineer what exists, understand the dependencies, and convert those recipes into the unified build — using agentic AI coding tools to move at a pace that would otherwise take years. Strong opinions about package management and host dependency handling are a real advantage here.
- Build repo automation that keeps the super‑repo sane — with 65+ component repos feeding into a unified system, manual dependency updates don't scale. You’ll map the full repo landscape, design the submodule/manifest architecture, and build the automation that keeps versions in sync without constant human intervention.
- Unblock new team members from day one — the tiger team is actively growing and new engineers are blocked until they have a working dev environment. You’ll stand up replicable, documented machine setups that solve the network access, firewall, and cloud quota constraints so onboarding stops being a bottleneck.
- 10+ years of software engineering experience with deep focus on build systems
- Strong hands‑on coding ability — this is an IC lead role, not a managing‑from‑above role
- Expert‑level knowledge of build tools (CMake, Ninja, Bazel, or equivalent) and build system design at scale
- Experience with packaging, host dependency management, and toolchain configuration
- Track record of modernizing or architecting a build system used by 100+ developers
- Strong understanding of version control and dependency management at scale (git submodules, manifest‑driven workflows, etc.)
- Fluency with agentic AI workflows (Cursor, Claude, Copilot, etc.) as a force multiplier for engineering throughput
- Experience with firmware or kernel build systems (embedded firmware, Linux kernel, or similar)
- Familiarity with GitHub Actions and CI/CD pipeline design
- Experience building in or migrating to cloud‑hosted runner environments (AWS)
- Sharpen your agentic AI engineering skills — the scope of this work (65+ firmware components, greenfield architecture, 1,000+ eventual users) means you'll be using AI coding agents as a core part of your workflow every day, not occasionally. Reverse‑engineering build recipes, generating dependency graphs, scaffolding build configurations — this is exactly the kind of high‑volume, pattern‑rich work where agentic AI makes the difference between a months‑long slog and a fast, iterative build.
You'll leave with a depth of experience in AI‑assisted engineering that is hard…