Senior Software Development Engineer - SGLang and Inference Stack

Job in Santa Clara, Santa Clara County, California, 95053, USA
Listing for: Advanced Micro Devices
Full Time position
Listed on 2026-02-23
Job specializations:
  • Software Development
    AI Engineer, Machine Learning/ML Engineer, Software Engineer
Salary/Wage Range or Industry Benchmark: 110000 - 160000 USD Yearly

WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture.

We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond.

Together, we advance your career.

THE ROLE:

As a core member of the team, you will play a pivotal role in optimizing and developing deep learning frameworks for AMD GPUs. Your work will be instrumental in enhancing GPU kernel performance, accelerating deep learning models, and enabling RL training and SOTA LLM and multimodal inference at scale across multi-GPU and multi-node systems. You will collaborate across internal GPU software teams and engage with open-source communities to integrate and optimize cutting-edge compiler technologies and drive upstream contributions that benefit AMD’s AI software ecosystem.

THE PERSON:

Skilled engineer with strong technical and analytical expertise in GPGPU C++, Triton, TileLang, or DSL development within Linux environments. The ideal candidate will thrive in both collaborative team settings and independent work, with the ability to define goals, manage development efforts, and deliver high‑quality solutions. Strong problem‑solving skills, a proactive approach, and a keen understanding of software engineering best practices are essential.

KEY RESPONSIBILITIES:
  • Optimize Deep Learning Frameworks:
    Enhance performance of frameworks like TensorFlow, PyTorch, and SGLang on AMD GPUs via upstream contributions in open‑source repositories.
  • Develop and Optimize Deep Learning Models:
    Profile, analyze, modify, and tune large‑scale training and inference models for optimal performance on AMD hardware.
    Provide day‑0 support for many SOTA models, such as DeepSeek 3.2 and Kimi K2.5.
  • GPU Kernel Development:
    Design, implement, and optimize high‑performance GPU kernels using HIP, Triton, TileLang, or other DSLs for AI operator efficiency.
  • Collaborate with GPU Library and Compiler Teams:
    Work closely with internal compiler and GPU math library teams to integrate, optimize, and align kernel‑level optimizations with full‑stack performance goals.
    Initiate and assist with codegen optimizations at different levels.
  • Contribute to SGLang Development:
    Support optimization, feature development, and scaling of the SGLang framework across AMD GPU platforms for LLM and multimodal serving and RL training.
  • Distributed System Optimization:
    Tune and scale performance across both multi‑GPU (scale‑up) and multi‑node (scale‑out) environments, including inference parallelism, prefill‑decode disaggregation, Wide‑EP and collective communication strategies.
  • Graph Compiler Integration:
    Integrate and optimize runtime execution through graph compilers such as XLA, TorchDynamo, or custom pipelines.
  • Open‑Source Collaboration:
    Partner with external maintainers to understand framework needs, propose optimizations, and upstream contributions effectively.
  • Apply Engineering Best Practices:
    Leverage modern software engineering practices in debugging, profiling, test‑driven development, and CI/CD integration.
PREFERRED EXPERIENCE:
  • Strong Programming Skills:
    Proficient in C++ and/or Python (PyTorch, Triton, TileLang), with demonstrated ability to code, debug, profile, and optimize performance‑critical code.
  • SGLang and LLM Optimization:
    Hands‑on experience with SGLang or similar LLM inference frameworks is highly preferred.
  • Compiler and GPU Architecture Knowledge:
    Background in compiler design or familiarity with technologies like LLVM, MLIR, or ROCm is a plus.
  • Heterogeneous System Workloads:
    Experience running and scaling workloads on large‑scale, heterogeneous clusters (CPU + GPU) using distributed training or inference strategies.
  • AI Framework Integration:
    Experience contributing to or…
Position Requirements
10+ Years work experience