Member of Technical Staff, Generalist; Remote

Remote / Online - Candidates ideally in
San Francisco, San Francisco County, California, 94199, USA
Listing for: Inferact
Remote/Work from Home position
Listed on 2026-02-16
Job specializations:
  • Software Development
    AI Engineer, Machine Learning/ ML Engineer
Salary/Wage Range or Industry Benchmark: 150,000 - 200,000 USD yearly
Job Description & How to Apply Below
Position: Member of Technical Staff, Exceptional Generalist (Remote)

About the Role

This is a globally remote opportunity. We're seeking exceptional generalist engineers who can work across the entire vLLM stack: from low-level GPU kernels to high-level distributed systems. This role is designed for self-directed, autonomous individuals who can identify the highest-leverage problems and solve them end-to-end without constant guidance.

You'll work asynchronously with our San Francisco headquarters while maintaining full ownership of critical infrastructure. You might be optimizing CUDA kernels one week, designing distributed orchestration systems the next, and implementing new model architectures the week after. The work you do will directly impact how the world runs AI inference.

Potential focus areas include:

  • Inference Runtime: Push the boundaries of LLM and diffusion model serving. Work at the core of vLLM to optimize how models execute across diverse hardware and architectures.

  • Kernel Engineering: Write the low-level kernels and optimizations that make vLLM the fastest inference engine in the world, running on hundreds of accelerator types.

  • Performance & Scale: Build the distributed systems that power inference at global scale—design foundational layers enabling vLLM to serve models across thousands of accelerators with minimal latency.

  • Cloud Orchestration: Build the operational backbone for cluster management, deployment automation, and production monitoring that enables teams worldwide to serve AI models without friction.

What We're Looking For

We're looking for engineers who thrive with autonomy. You should be able to take a vague problem statement and turn it into shipped code with minimal supervision. You communicate proactively, over-communicate context across time zones, and know when to ask for help versus when to push forward independently.

Core Requirements:

  • Bachelor's degree or equivalent experience in computer science, engineering, or a similar field

  • Demonstrated ability to work autonomously and drive projects to completion without close supervision

  • Excellent asynchronous communication skills and ability to collaborate effectively across time zones

  • Strong track record of shipping high-impact work in complex technical environments

  • Deep expertise in at least one of: systems programming, GPU/accelerator programming, distributed systems, or ML infrastructure

Technical Depth (strong in at least two):

  • CUDA kernels or equivalent (Triton, TileLang, Pallas) with deep understanding of GPU architecture

  • High-performance distributed systems in Rust, Go, or C++

  • Python with PyTorch internals and LLM inference systems (vLLM, TensorRT-LLM, SGLang)

  • Kubernetes, container orchestration, and infrastructure-as-code at scale

  • Transformer architectures, KV-cache memory management, and model serving

Preferred Qualifications:

  • Contributions to vLLM or other major open-source ML/systems projects

  • Experience with multiple accelerator platforms (NVIDIA, AMD, TPU, Intel)

  • Knowledge of quantization techniques, ML-specific kernel optimization, or compiler technologies

  • Track record of improving system reliability and performance at scale

  • Widely shared technical blog posts or impactful side projects in the ML infrastructure space

Logistics
  • Location: Fully remote, worldwide. We're timezone-flexible but expect regular overlap with Pacific Time for critical syncs.

  • Compensation: We offer competitive compensation (salary + equity) relative to local market conditions.

  • Visa sponsorship: We sponsor visas on a case-by-case basis.

  • Benefits: Inferact offers competitive benefits appropriate to your location, including health coverage where applicable.
