Developer - I/O Acceleration
Listed on 2026-06-06
Software Development
Data Engineering
Introduction
IBM is building its next‑generation Lakehouse analytics platform, engineered to deliver breakthrough price/performance on state‑of‑the‑art GPUs and high‑performance storage systems. The I/O Acceleration team builds the storage‑to‑GPU data path that makes that promise real: accelerated decompression, NVMe and parallel‑file‑system scheduling, and the runtime plumbing that keeps GPUs fed at line rate instead of waiting on bytes.
Your Role And Responsibilities
As an Engineer on I/O Acceleration, you will own and optimize the path data travels from disk to GPU memory. Your work will directly determine how many queries per dollar Gala Lakehouse can deliver on Blackwell, and whether modern analytics workloads run GPU‑bound (where they should) or I/O‑bound (where they are today).
What You’ll Do
- Design, build, and optimize accelerated I/O and decompression paths for data‑intensive analytics workloads.
- Improve end‑to‑end throughput across the storage > network > host > GPU boundary, eliminating copies, syscalls, and stalls.
- Integrate with GPU‑aware runtimes and high‑bandwidth fabrics (GPUDirect Storage, RDMA, NVMe‑oF) and tune for Blackwell‑class hardware.
- Build benchmarks and microbenchmarks that expose I/O cliffs, queue contention, and tail latency under realistic query mixes.
- Instrument the data path so cost‑per‑query, bandwidth‑per‑GPU, and CPU overhead are first‑class, observable metrics.
- Collaborate with the query engine, storage, and hardware teams to co‑design APIs that make accelerated I/O usable, not just possible.
Bachelor's Degree
Required Technical And Professional Expertise
- Strong modern C++ and deep comfort with Linux systems internals (page cache, O_DIRECT, NUMA, scheduling).
- Hands‑on experience in at least one of: storage I/O subsystems, decompression and codec implementation, or query‑engine data paths.
- Working knowledge of GPU‑aware pipelines or adjacent acceleration frameworks (CUDA, GPUDirect, or similar).
- Strong performance‑profiling and bottleneck‑isolation skills — you can read a flame graph, an nsys trace, and an fio result and know what to do next.
- Familiarity with distributed data systems and the realities of running them at scale.
- Track record of delivering production software in Agile, collaborative environments, including contributing to automated CI/CD pipelines.
- Production experience with GPUDirect Storage, RDMA, or NVMe‑oF integrations.
- Exposure to ESS 6000, Lustre, GPFS, or other parallel and clustered file systems.
- Track record of large‑scale benchmarking, and publishing or presenting performance results.
- Contributions to open‑source data, storage, or GPU‑runtime projects (Arrow, cuDF, Velox, DuckDB, Spark, and similar).