Distinguished Engineer, Storage - AI Cloud Job Santa Clara area, California USA, IT/Tech

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPUs act as the brains of computers, robots, and self-driving cars that can understand the world.

Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

AI Cloud Data Storage

The NVIDIA DGXC Storage org supports some of the fastest training and inference workloads. Every GPU cycle depends on a storage platform built to keep tens of thousands of accelerators continuously busy. That platform securely maintains exabytes of data and powers the largest AI workloads worldwide across cloud, neocloud, and on-prem deployments. As accelerated computing grows, storage becomes essential: it can make the difference between effective GPU use and wasted potential, and between launching a frontier model on time and missing the deadline by months.

We seek a Distinguished Engineer to lead NVIDIA's storage strategy for AI Cloud across the Neocloud Provider (NCP) and Cloud Service Provider (CSP) ecosystem. You will direct the architecture of high-performance parallel file systems, object stores, and block storage at exabyte scale. You will stay hands-on, collaborating with engineers, SREs, partners, and storage vendors. You will apply NVIDIA's AI tools to increase your own productivity and that of the people you work with.

This is a rare opportunity to define the storage framework of the AI era at the company that introduced accelerated computing.

What you'll be doing:

* Lead the multi-year technical plan for AI Cloud Storage expansion across NCPs. Determine the reference architecture, capabilities, performance and durability SLOs, qualification methodology, and roadmap for the high-performance file, object, and block storage that each NCP must offer to qualify for NVIDIA GPU allocation.

* Serve as the chief storage architect with deep hands-on involvement. Lead key reviews of storage builds and investigate root causes of complex production problems. Develop prototype reference implementations to minimize risks in new initiatives. Make final technical decisions on NCP storage deliveries using measurable SLOs. Apply AI tools heavily to amplify your technical influence throughout the program.

* Define the standard for "production-ready" in NCP storage, including durability and availability SLOs measured in 9s. Ensure sustained efficiency per TiB, observability, blast-radius containment, and reduced operational toil. Influence GPU delivery gating by requiring AI Cloud to accept GPU capacity only after verifying storage-focused ancillary services.

* Develop and guide the architectural direction by working closely with collaborators in training, inference, and accelerated-computing product lines. Coordinate with site-reliability, operations, networking, and security colleagues. Work together with external cloud providers, neocloud operators, and storage vendors to align on a common architecture.

* Develop the open-source path forward for AI storage. Establish and guide an open-source strategy that broadens the AI storage ecosystem. Advocate for a GitHub-first, security-first stance. Engage deeply with upstream open-source communities. Formalize the APIs, SDKs, and protocols that allow partners and the industry to build, integrate, and create with NVIDIA at the AI storage level.

* Lead an engineering culture centered on AI tools. Regularly use modern AI coding and agentic tools in your daily tasks. Show what 10× engineering means, and contribute patterns, prompts, and evaluation harnesses across the storage organization.

* Partner with peer Distinguished and Principal storage architects across the organization to tackle the most difficult, long-term technical challenges. Make automation the only acceptable solution for infrastructure management tasks like live software upgrades, node and drive replacements, capacity rebalancing, cross-DC data movement, and dataset lifecycle. Establish root-cause analysis and corrective action rigor on every major incident. Design the storage layer for workloads spanning the next several GPU generations, including disaggregated inference with storage-backed KV caching, large-scale write-once-read-many inference patterns, exabyte regional object stores, and cross-DC dataset versioning and copy management.

* Mentor and develop senior, principal, and distinguished engineers across the storage organization and nearby business units. Raise the technical bar broadly. Represent NVIDIA externally in standards bodies, open-source communities, customer briefings, and industry forums (FAST, SC, OCP, SNIA, Linux Storage…
