Senior Hardware Development Engineer AWS AI & ML, Accelerator Servers
Listed on 2026-05-30
Engineering
Systems Engineer, Hardware Engineer, Electrical Engineering
Job | Amazon Web Services, Inc.
AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. In other words, we’re the people who keep the cloud running. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. We work on the most challenging problems, with thousands of variables impacting the supply chain — and we’re looking for talented people who want to help.
You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, operations managers, and other vital roles. You’ll collaborate across AWS to deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers. You will experience an inclusive culture that welcomes bold ideas and empowers you to own them to completion.
Our team designs, builds, and operates Amazon's fleet of accelerated servers, which use Amazon's internally designed silicon or special-purpose accelerators (EC2 TRN, INF, G, F, and other instance types). We solve systemic hardware issues, and we build hardware and software systems that detect and mitigate recurrences so that our customers experience the highest possible quality of service.
You will architect, design, and own a new segment of accelerated servers for the AWS fleet. This includes defining board‑level architecture, component selection, and managing manufacturing partnerships through development and production. You will make critical design decisions on thermal, power, signal integrity, and mechanical integration while leading cross‑functional teams from concept through data center deployment. You will define requirements and conduct technical reviews to ensure designs meet AWS standards.
Your designs will power AWS infrastructure supporting Amazon's internally designed silicon and special-purpose accelerators across EC2 instance types. You will design for reliability and manufacturability, incorporating lessons learned from fleet operations into your architecture decisions. Your designs will include built‑in diagnostics and telemetry to enable efficient validation and operations.
Key Job Responsibilities
- Design and Architecture: Own server architecture, board design, component selection, thermal and power design, and ODM technical reviews. Make trade‑off decisions balancing performance, cost, and manufacturability. Lead design reviews with manufacturing partners and ensure designs scale to production volumes.
- Fleet Operations: Monitor production quality, analyze field data to inform future designs, and drive continuous improvement. Collaborate with operations teams to ensure your designs meet reliability targets in production environments.
You will make design decisions: defining technical requirements, conducting design reviews with manufacturing partners, selecting components, and architecting thermal and power solutions. You will interface with customers to translate requirements into technical specifications and work with manufacturing partners to ensure designs scale to production. You will collaborate with interdisciplinary teams, including component engineers, firmware developers, test engineers, and integration specialists, to deliver complete server solutions.
Basic Qualifications
- Experience in developing functional specifications, design verification plans, and functional test procedures
- Experience working with interdisciplinary teams to execute product design from concept to production
- Experience in server technologies such as thermal, mechanical, power, and signal integrity
- Bachelor's degree in electrical engineering or equivalent
- Master's degree in electrical engineering, computer engineering, or equivalent
- Experience with the project management of technical projects
- 7+ years of server, storage, networking, or large‑scale distributed systems experience
- 10 years hardware development with a focus on system/server…