
GPU Performance Engineer | Experienced

Job in Allentown, Lehigh County, Pennsylvania, 18103, USA
Listing for: SIG Susquehanna
Full Time position
Listed on 2026-06-26
Job specializations:
  • IT/Tech
    AI Engineer (Applied/Software)
Salary/Wage Range or Industry Benchmark: 120,000 - 150,000 USD Yearly
Job Description & How to Apply Below
Position: GPU Performance Engineer | Experienced Hire

Overview

We are looking for a GPU Performance Engineer to build highly optimized CUDA kernels for low-latency inference. This role is focused on workloads where off-the-shelf runtimes and vendor libraries do not fully exploit the structure of the model, and where custom kernels, memory layouts, and execution strategies can deliver meaningful gains.

You will work closely with quantitative researchers and engineers to understand model structure, identify computational bottlenecks, and turn mathematical ideas into production-grade GPU implementations. You will use your understanding of GPU hardware to help shape models that are both mathematically effective and efficient to run. The problems span compact neural networks, tree-based models, and other structured inference workloads where latency, throughput, and efficiency all matter.

This role is a strong fit for someone who enjoys low-level optimization, performance analysis, and translating abstract models into hardware-efficient code.

What you’ll do
  • Design, implement, and optimize custom CUDA kernels for latency‑critical inference workloads
  • Develop fine‑grained GPU implementations tailored to specific model structures
  • Analyze quantitative research models and computational bottlenecks to identify opportunities for parallelization and hardware‑efficient execution
  • Collaborate directly with quantitative researchers to translate mathematical models into high‑performance compute pipelines
  • Optimize end‑to‑end inference performance through kernel tuning, memory‑layout design, execution strategy, I/O optimization, and precision tradeoffs
  • Profile and benchmark GPU performance
  • Improve latency and throughput in production inference systems
  • Contribute to GPU architecture decisions and performance best practices
What we’re looking for
  • Strong proficiency in writing and optimizing CUDA kernels
  • Solid programming experience in C/C++ (preferred)
  • Deep understanding of GPU architecture, including memory hierarchy, SIMT execution, occupancy, and latency/throughput tradeoffs
  • Ability to reason about numerical stability, precision, performance tradeoffs, and how model design choices affect hardware efficiency
  • Strong problem‑solving skills and comfort working with low‑level systems
Preferred qualifications
  • PhD in mathematics, physics, computer science, engineering, or related quantitative field
  • Strong background in linear algebra, probability, numerical methods, or scientific computing
  • Experience working with quantitative research teams or financial models
  • Demonstrated ability to improve real‑world inference performance beyond baseline framework or library implementations
  • Familiarity with PTX‑level behavior, tensor core utilization, or architecture‑specific tuning
  • Exposure to ONNX Runtime, TensorRT, Triton, TVM, or similar systems
  • Exposure to neural networks, tree‑based models (e.g., LightGBM), and state space models (e.g., Mamba architectures), plus experience with kernel fusion, custom operators, model compilation, or graph‑level optimization