Senior Silicon Design Engineer: Hardware-Software Co-Design, GPU Compute and AI Compilers & Runtimes
Listed on 2026-05-09
IT/Tech
Systems Engineer, Hardware Engineer, AI Engineer
Overview
At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture.
We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.
The AMD AI Group is looking for a Senior Silicon Design Engineer to co‑design the hardware‑software interface for AMD Instinct™ GPUs. You will work with hardware architects to model and evaluate how GPU microarchitecture, AI frameworks, and the ROCm™ software stack interact, using that analysis to drive improvements in both performance and energy efficiency across the AI compute stack.
Your Day‑to‑Day
Build performance and power models that connect architectural decisions to real workload behavior. Evaluate proposed hardware features and software API designs against AI training and inference targets. Write and optimize GPU kernels and compiler passes that exploit microarchitectural characteristics. Work across the stack: the Linux kernel DRM/GEM layer, the ROCm™ runtime, the IREE compiler, and AI frameworks and DSLs like PyTorch and Triton.
Key Responsibilities
- Hardware‑Software Co‑Design & Performance
- Leverage detailed performance models that bridge GPU microarchitecture and software execution, including instruction execution, memory hierarchy behavior, networking, and compute‑memory overlap. You will use those models to drive both software optimization and hardware design feedback.
- Analyze and characterize AI workloads (training and inference) against AMD Instinct™ microarchitecture to identify performance bottlenecks, inform silicon roadmap decisions, and validate that architectural features deliver expected real‑world speedups.
- Collaborate directly with AMD GPU hardware and firmware teams to evaluate proposed hardware features, providing software‑grounded analysis of how changes to the memory subsystem, cache coherence protocols, or execution units affect workloads that matter.
- Own the feedback loop between workload characterization and hardware design: build the methodology, tooling, and benchmarks that make this loop fast and evidence‑driven across Instinct generations.
- Compiler & Runtime Infrastructure
- Contribute to the IREE compiler and ROCm™ stack, with focus on backend optimizations, runtime API extensions, and firmware command packet design that target AMD GPU microarchitecture.
- Design and optimize compute runtime paths including AI framework dispatch, data movement, and pipelining, minimizing overhead and maximizing hardware utilization for latency‑sensitive and throughput‑critical workloads.
- Work at the Linux kernel level on GPU compute infrastructure to improve asynchronous execution and reduce latency.
- Cross‑Cutting
- Contribute to the open ROCm™ ecosystem, across the stack, in ways that raise the bar for the entire AMD AI developer experience.
- Develop reproducible benchmarks and performance regression infrastructure that keeps the full stack honest across compiler, runtime, and driver changes.
- Represent the software perspective in silicon architecture reviews and represent hardware realities to the software and framework teams. Be the person who holds both models in their head at once.
- Collaborate with AMD architecture teams to provide software feedback on next‑generation Instinct™ GPU designs for both training and inference workloads.
- 5+ years of industry experience working at the boundary of GPU compute hardware and AI systems software.
- Strong computer architecture foundation: memory consistency and coherence models, GPU microarchitecture, pipelining, and the ability to reason precisely about what hardware is doing on a given workload.
- Demonstrated performance modeling skills and the ability to build…