
Member of Technical Staff - Inference

Job in Palo Alto, Santa Clara County, California, 94306, USA
Listing for: RadixArk
Full Time position
Listed on 2026-03-04
Job specializations:
  • IT/Tech
    Systems Engineer, AI Engineer
Job Description & How to Apply Below
Position: Member of Technical Staff -- Inference
About the Role

RadixArk is seeking a Member of Technical Staff - Inference to push the limits of large-scale AI inference.

You will work on the core systems that serve frontier models at scale, optimizing performance, latency, throughput, and cost across thousands of GPUs. This role sits at the intersection of systems engineering, ML infrastructure, and performance optimization.

Your work will directly shape how state-of-the-art models are deployed and experienced by users worldwide.

This is a deeply technical, high-impact role for engineers who enjoy working close to the hardware-software boundary and solving performance-critical problems at scale.
Requirements
  • 5+ years of experience in systems engineering, ML infrastructure, or performance-critical backend systems
  • Strong expertise in large-scale inference systems for LLMs or generative models
  • Deep understanding of GPU architecture and performance characteristics
  • Experience optimizing latency- and throughput-critical production systems
  • Strong knowledge of distributed systems and networking fundamentals
  • Proficiency in C++, Rust, Go, or Python for production systems
  • Experience profiling and optimizing compute-intensive workloads
  • Strong debugging skills across system layers (model, runtime, kernel, network)
Strong Plus
  • Experience with LLM serving stacks (vLLM, TensorRT-LLM, SGLang, etc.)
  • Familiarity with CUDA, Triton, or custom kernel optimization
  • Experience with batching, KV-cache management, and scheduling strategies
  • Experience running inference at scale (1000+ GPUs)
  • Background in HPC or high-performance systems
  • Open-source contributions in ML or systems infrastructure
Responsibilities
  • Design and build large-scale inference systems for frontier AI models
  • Optimize latency, throughput, and GPU utilization in production inference
  • Develop and improve model serving architectures and runtimes
  • Work on batching, scheduling, and memory management strategies
  • Collaborate with kernel, compiler, and systems teams on performance optimization
  • Debug performance bottlenecks across the stack
  • Drive reliability and scalability of inference infrastructure
  • Build tooling for observability, profiling, and performance analysis
  • Contribute to long-term inference architecture and strategy
About RadixArk

RadixArk is an infrastructure-first company built by engineers who've shipped production AI systems, created SGLang (20K+ GitHub stars, the fastest open LLM serving engine), and developed Miles (our large-scale RL framework).

We're on a mission to democratize frontier-level AI infrastructure by building world-class open systems for inference and training.

Our team has optimized kernels serving billions of tokens daily and designed distributed systems coordinating 10,000+ GPUs across training and serving.

We're backed by leading infrastructure investors and collaborate with frontier AI labs and cloud providers.

Join us in building the infrastructure layer that powers the next generation of AI.
Compensation

We offer competitive compensation with meaningful equity, comprehensive benefits, and flexible work arrangements. Compensation depends on location, experience, and level.
Equal Opportunity

RadixArk is an Equal Opportunity Employer and welcomes candidates from all backgrounds.