Developer - I/O Acceleration Job, San Jose area, California, USA, Software Development

Introduction

IBM is building its next‑generation Lakehouse analytics platform, engineered to deliver breakthrough price/performance on state‑of‑the‑art GPUs and high‑performance storage systems. The I/O Acceleration team builds the storage‑to‑GPU data path that makes that promise real: accelerated decompression, NVMe and parallel‑file‑system scheduling, and the runtime plumbing that keeps GPUs fed at line rate instead of waiting on bytes.

Your Role And Responsibilities

As an Engineer on I/O Acceleration, you will own and optimize the path data travels from disk to GPU memory. Your work will directly determine how many queries per dollar Gala Lakehouse can deliver on Blackwell — and whether modern analytics workloads run GPU‑bound (where they should) or I/O‑bound (where they are today).

What You’ll Do

Design, build, and optimize accelerated I/O and decompression paths for data‑intensive analytics workloads.
Improve end‑to‑end throughput across the storage > network > host > GPU boundary, eliminating copies, syscalls, and stalls.
Integrate with GPU‑aware runtimes and high‑bandwidth fabrics (GPUDirect Storage, RDMA, NVMe‑oF) and tune for Blackwell‑class hardware.
Build benchmarks and microbenchmarks that expose I/O cliffs, queue contention, and tail latency under realistic query mixes.
Instrument the data path so cost‑per‑query, bandwidth‑per‑GPU, and CPU overhead are first‑class, observable metrics.
Collaborate with the query engine, storage, and hardware teams to co‑design APIs that make accelerated I/O usable, not just possible.

Preferred Education

Bachelor's Degree

Required Technical And Professional Expertise

Strong modern C++ and deep comfort with Linux systems internals (page cache, O_DIRECT, NUMA, scheduling).
Hands‑on experience in at least one of: storage I/O subsystems, decompression and codec implementation, or query‑engine data paths.
Working knowledge of GPU‑aware pipelines or adjacent acceleration frameworks (CUDA, GPUDirect, or similar).
Strong performance‑profiling and bottleneck‑isolation skills — you can read a flame graph, an nsys trace, and an fio result and know what to do next.
Familiarity with distributed data systems and the realities of running them at scale.
Track record of delivering production software in Agile, collaborative environments, including contributing to automated CI/CD pipelines.

Preferred Technical And Professional Experience

Production experience with GPUDirect Storage, RDMA, or NVMe‑oF integrations.
Exposure to ESS 6000, Lustre, GPFS, or other parallel and clustered file systems.
Track record of large‑scale benchmarking, and publishing or presenting performance results.
Contributions to open‑source data, storage, or GPU‑runtime projects (Arrow, cuDF, Velox, DuckDB, Spark, and similar).
