Principal Networking Engineer - QoS/Networking

Job in Santa Clara, Santa Clara County, California, 95053, USA
Listing for: Advanced Micro Devices
Full Time position
Listed on 2026-02-23
Job specializations:
  • IT/Tech
    Systems Engineer, IT Support, Cybersecurity
Salary/Wage Range or Industry Benchmark: 130000 - 180000 USD Yearly
Job Description & How to Apply Below
Position: Principal Networking Engineer - QoS / Networking

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture.

We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE:

We are seeking a hands-on Principal Networking Engineer to own end-to-end QoS strategy and implementation across data center SmartNICs/DPUs. You will define traffic classification, shaping, scheduling, and congestion control policies spanning Top-of-Rack (ToR)/leaf/spine switches and host offload (SmartNIC/DPU), ensuring predictable performance for AI/ML, storage, and latency-sensitive services. The ideal candidate combines deep knowledge of L2/L3/L4 QoS, RDMA/RoCE, PFC/ETS/ECN, and switch silicon schedulers/queues, with practical experience deploying policies at fleet scale.

THE PERSON:

We are seeking an experienced Principal Networking Engineer to drive the continued development of existing and future software systems and products. The successful candidate will be responsible for ensuring the functionality, reliability, and performance of our software products while keeping an eye on future enabling and related technologies. The ideal candidate will have a strong background in software engineering, excellent technical skills, and strong communication skills.

KEY RESPONSIBILITIES:
  • Own QoS architecture across network tiers (host → NIC/DPU), including classification, policing, shaping, queue mapping, and scheduling strategies for mixed workloads (AI collectives, storage, RPC, control plane).
  • Design and implement SmartNIC QoS: map DSCP/PCP to NIC traffic classes, configure hardware TX/RX queues, rate limiters, WFQ/DRR schedulers, and offload paths for RDMA/TCP/UDP.
  • Switch QoS policy design: configure PFC, ETS, ECN/RED/WRED, buffer pools, queue thresholds, shared vs. dedicated buffers, and congestion control across multiple ASICs (e.g., Broadcom, NVIDIA/Mellanox, Marvell).
  • RDMA/RoCE tuning end-to-end: lossless/loss-tolerant modes, CNP/ECN parameters, RNR/retry behavior, MTU/Jumbo frames, and scalable multi-tenant profiles.
  • Performance engineering: build test plans and run micro/macro benchmarks (e.g., _lat/_bw, RCCL/NCCL, iperf, switch counters/telemetry) to validate latency, throughput, tail performance, and fairness.
  • Instrumentation & observability: define SLIs/SLOs for QoS (tail latency, drops, PFC events, ECN marks, queue depth, buffer occupancy); integrate with streaming telemetry (gNMI/INT/sFlow) and develop dashboards and alerts.
  • Troubleshoot complex incidents: incast, PFC deadlocks, microbursts, head-of-line blocking, unfair scheduling, and noisy neighbors; lead root-cause analysis and corrective actions.
  • Scale & automation: deliver declarative QoS via intent-based configs and CI/CD (e.g., Ansible/Salt, NAPALM, gNMI/gNOI, NETCONF/YANG), including pre-deployment simulation and automated canary/rollback.
  • Documentation & standards: author design docs, runbooks, and guidance for tenant teams; contribute to internal standards and vendor requirements.

MINIMUM QUALIFICATIONS:
  • Strong experience in data center networking or systems engineering, with direct ownership of QoS on switches and/or SmartNICs/DPUs.
  • Deep knowledge of QoS mechanisms: classification/marking (DSCP/PCP), policing, shaping, queueing (PRIO, WRR/WFQ/DRR), scheduling hierarchies, and buffer management.
  • Hands-on with PFC, ETS, ECN/WRED, explicit buffer tuning, and RDMA/RoCE performance/correctness in production.
  • Experience configuring merchant switch silicon (e.g., Broadcom Trident/Tomahawk, NVIDIA Spectrum, Marvell Teralynx) via NOS CLIs/SDKs (e.g., SONiC, Cumulus, NX-OS, EOS, Onyx).
  • SmartNIC/DPU experience (e.g., NVIDIA BlueField, Intel…