
Senior Engineer GPU Kernel and Performance

Job in Federal Way, King County, Washington, 98003, USA
Listing for: DigitalOcean
Full Time position
Listed on 2026-06-09
Job specializations:
  • IT/Tech
    AI Engineer (Applied/Software), Systems Engineer
Salary/Wage Range or Industry Benchmark: 209,000 USD yearly
Job Description & How to Apply Below
Position: Senior Engineer 2: GPU Kernel and Performance

Dive in and do the best work of your career. Journey alongside a strong community of top talent who are relentless in their drive to build the simplest scalable cloud. If you have a growth mindset, naturally like to think big and bold, and are energized by the fast‑paced environment of a true industry disruptor, you’ll find your place here. We value winning together while learning, having fun, and making a profound difference for the dreamers and builders in the world.

DigitalOcean is seeking a Senior Engineer 2 to play a key technical role in our AI Inference Optimization team. DigitalOcean aims to be the Inference Cloud of choice for digitally native companies, and you will help ensure we offer industry‑leading performance for our inference services. You will be responsible for the architectural decisions that maximize throughput and minimize latency for the world’s most advanced large models.

As an IC leader, you will act as a force multiplier for the engineering organization, solving the most complex bottlenecks in memory bandwidth and compute utilization while guiding the technical roadmap for our high‑performance inference fleet.

What You’ll Do
  • Performance Architecture:
    Lead the technical strategy for benchmarking and performance optimizations at the inference engine and GPU kernel layers, ensuring our infrastructure extracts maximum value from every TFLOP.
  • Deep‑Dive Optimization:
    Engineer solutions for complex performance issues, including attention layer optimizations, memory and precision management, and advanced parallelization across multi‑node GPU clusters.
  • Technological Innovation:
    Proactively implement cutting‑edge optimization techniques to keep DigitalOcean at the forefront of the Gen AI landscape. Some examples of projects you may work on:
    • Improve batch size performance using AMD’s AITER library for AMD MI355X – identify and tune AITER’s CK (Composable Kernel) or ASM (assembly) kernels to optimize FP8 / BF16
    • Identify kernel fusion opportunities for GLM‑5 kernels for different layers of the Transformer block (Flash Attention, RMS Norm)
    • Tune expert gateway router kernels for MoE models like Qwen3‑235B, DeepSeek V3, GLM‑5, etc.
  • Hardware & Ecosystem Mastery:
    Act as the subject matter expert on modern GPU families (NVIDIA/AMD) and their software stacks (CUDA, ROCm, TensorRT, OpenAI Triton), advising on hardware procurement and software integration.
  • Technical Mentorship:
    Lead by example through high‑quality code and design reviews, elevating the technical bar for the team without the administrative overhead of direct management.
  • Strategic Collaboration:
    Partner with Product Management and TPMs to translate "theoretical hardware limits" into "shippable product features," ensuring our platform is both powerful and developer‑friendly.
  • Community Leadership:
    Maintain a strong presence in the GPU infrastructure and model performance optimization communities, contributing to and integrating the best of open‑source AI.
What You’ll Bring To DigitalOcean
  • Technical Depth:
    5+ years of experience in high‑performance computing or AI infrastructure, with a proven track record of solving compute utilization and memory bandwidth bottlenecks.
  • Gen AI Literacy:
    Deep familiarity with the Gen AI (LLM, VLM, LMM) landscape, including the specific quirks and architectural requirements of major model families.
  • Optimization Expert:
    Hands‑on experience with attention‑layer optimizations and parallelization strategies across distributed GPU environments.
  • Hardware Fluency:
    Comprehensive understanding of NVIDIA and AMD GPU architectures and their respective software ecosystems (CUDA, ROCm, etc.).
  • Open Source Mastery:
    Extensive experience integrating, building with, and contributing to open‑source software projects.
  • Systems Design:
    Excellent system design skills, particularly related to low‑level GPU programming – optimization, memory access patterns, and parallel execution.
  • Leadership through Influence:
    Experience acting as a technical lead, driving design and delivery through cross‑functional alignment and expert‑level delegation.
Compensation Range
  • $ to $209,000
  • This is a remote role
Why You’ll Like Working for DigitalOcean
  • We…
Position Requirements
10+ Years work experience