AI Systems Performance Engineer Job, Milpitas area, California, USA, IT/Tech

Position: Staff AI Systems Performance Engineer

Company Description

Sandisk understands how people and businesses consume data and we relentlessly innovate to deliver solutions that enable today's needs and tomorrow's next big ideas. With a rich history of groundbreaking innovations in Flash and advanced memory technologies, our solutions have become the beating heart of the digital world we're living in and that we have the power to shape.

Sandisk meets people and businesses at the intersection of their aspirations and the moment, enabling them to keep moving and pushing possibility forward. We do this through the balance of our powerhouse manufacturing capabilities and our industry-leading portfolio of products that are recognized globally for innovation, performance and quality.

Sandisk has two facilities recognized by the World Economic Forum as part of the Global Lighthouse Network for advanced 4IR innovations. These facilities were also recognized as Sustainability Lighthouses for breakthroughs in efficient operations. With our global reach, we ensure the global supply chain has access to the Flash memory it needs to keep our world moving forward.

Job Description

The ideal candidate will be responsible for designing, defining, implementing, and enabling comprehensive benchmark tests for AI infrastructure platforms, including box-level GPU systems, multi-GPU servers, and GPU rack-scale deployments. This role requires a strong understanding of AI workloads, system architectures, and performance characterization methodologies across modern AI infrastructure environments.

The individual will work closely with Marketing, Product Management and System Architecture teams to understand benchmark requirements and translate business and customer use cases into measurable performance validation strategies. The candidate will develop benchmark proposals, define evaluation methodologies, and execute performance studies for a wide range of AI applications, including Chat Assistants and LLM inference, Retrieval-Augmented Generation (RAG), Speech AI, Vision AI, multimodal AI, recommendation systems, and Image/Video Generation workloads.

Responsibilities include selecting and optimizing benchmark frameworks, configuring AI software stacks, validating hardware and software performance, analyzing bottlenecks across compute, memory, storage, and networking subsystems, and generating detailed performance reports with actionable insights. The candidate will evaluate AI workloads across different hardware configurations such as GPUs, CPUs, accelerators, high-speed interconnects, NVLink/NVSwitch fabrics, storage architectures, and network fabrics to compare scalability, latency, throughput, power efficiency, and cost-performance metrics.

The role also involves collaborating with internal and external partners to enable emerging AI models, benchmark suites, and infrastructure technologies, while ensuring reproducibility, automation, and continuous benchmarking capabilities within AI lab environments. Strong analytical, scripting, and performance tuning skills are essential, along with hands-on experience in AI frameworks, GPU computing, distributed inference environments, and performance monitoring tools.

Essential Duties and Responsibilities:

• Design & Validate AI Infrastructure for Benchmarks

• Analyze Benchmarks for End-to-end AI Infrastructure and develop test environments for Benchmark tests

• Define and Perform Benchmark tests for AI workloads on storage systems

• Evaluate GPU vs CPU vs storage bottlenecks in AI pipelines

• Research and develop innovative Benchmarks for AI workloads on storage, specific to Inference & training of major models

• Design Benchmarks for Vector DB & KV cache

• Optimize Data pipelines for Inference and training

• Analyze Benchmark results, then document and publish them with recommendations

Skills:

- Experience with different Operating Systems (Windows, Linux, VMware).
- Proficiency in scripting and/or programming languages such as Shell, Python, and C/C++ is required.
- AI Infrastructure & Hardware Awareness - GPU/CPU architecture basics
- Experience in Performance & Benchmarking - profiling tools, system bottlenecks
- Debugging knowledge on performance bottlenecks…