Principal Architect, AI Networking
Listed on 2026-05-31
Software Development
AI Engineer, Software Engineer, Machine Learning/ ML Engineer
An applied research team within NVIDIA's Networking Systems & Software Architecture group is solving some of AI's hardest infrastructure problems. The team builds systems-level software that moves data between GPUs, nodes, and storage at the speed modern AI demands, spanning low-level transport optimization, hardware-software co-design, and communication frameworks that plug directly into production AI stacks. The team's charter expands into emerging domains including quantum computing interconnects.
What You Will Be Doing
- Setting the long-term technical vision for distributed AI communication systems: GPU-to-GPU, GPU-to-storage, and cross-node data movement.
- Conducting original research and prototyping next-generation networking solutions over RDMA, NVLink, and GPUDirect.
- Driving hardware-software co-optimization across GPUs, DPUs, NICs, and network switches.
- Investigating fundamental bottlenecks in communication runtimes for large-scale AI workloads (KV cache transfer, disaggregated prefill/decode, model parallelism).
- Integrating networking capabilities into AI serving stacks such as vLLM, SGLang, and TensorRT-LLM.
- Publishing findings, representing NVIDIA in industry forums and standards bodies, and mentoring senior engineers across the organization.
- 15+ years in systems software and/or networking, with deep expertise in high-performance networking (InfiniBand, RoCE, RDMA, NVLink), communication libraries (e.g. NIXL, NCCL, UCX, MPI, NVSHMEM), and GPU-accelerated systems, and a track record of defining and delivering complex, cross-team technical initiatives from research concept to production.
- MS, PhD or equivalent experience in Computer Science, Computer Engineering, Electrical Engineering, or a related field.
- Deep understanding of computer architecture, memory hierarchies, DMA engines, and OS-level networking.
- Understanding of ML systems concepts: transformer architectures, KV cache mechanics, model parallelism, or distributed training and inference patterns.
- Proficiency in programming languages such as C, C++, Rust, and Python.
- Knowledge of ML inference frameworks (vLLM, SGLang, TensorRT-LLM) and their communication requirements.
- CUDA programming and NVIDIA GPU architecture expertise.
- Proven experience influencing product strategy and technical roadmap at a senior level.
- Major open-source contributions.
Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 272,000 USD - 431,250 USD. You will also be eligible for equity and benefits.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.