Senior / Staff AI Research Engineer, Real-Time Inference

Job in Milpitas, Santa Clara County, California, 95035, USA
Listing for: RoboForce
Full Time position
Listed on 2026-04-17
Job specializations:
  • Software Development
    AI Engineer
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below
Position: Senior / Staff AI Research Engineer, Real-Time Inference

Milpitas, CA

Responsibilities
  • Develop and optimize inference pipelines for embodied AI models (VLA, perception, world models) targeting real‑time execution on edge hardware such as NVIDIA Jetson platforms.
  • Implement CUDA‑level optimizations including custom kernels, memory layout tuning, and hardware‑aware graph compilation to minimize model latency.
  • Apply and advance model compression techniques — quantization (INT8/FP16/INT4), pruning, distillation, and structured sparsity — to achieve production‑grade throughput on constrained devices.
  • Profile and debug end‑to‑end inference stacks using tools such as Nsight, TensorRT, and Triton to identify and eliminate performance bottlenecks.
  • Collaborate with ML research and robotics teams to co‑design model architectures that meet real‑time control‑loop latency requirements.
  • Establish benchmarking frameworks to evaluate model performance across latency, throughput, power consumption, and accuracy tradeoffs on target hardware.
Requirements
  • Master’s degree in Computer Science, Electrical Engineering, or a related field with 4+ years of experience, or a PhD.
  • Deep expertise in CUDA programming, GPU architecture, and low‑level kernel optimization, including custom kernel authoring with tools such as Triton.
  • Hands‑on experience with model quantization, pruning, distillation, and deployment using frameworks such as TensorRT, ONNX Runtime, TVM, or Triton.
  • Proficiency in C++ and Python; strong systems programming and performance profiling skills.
  • Experience deploying ML models on edge or embedded hardware (e.g., NVIDIA Jetson, Orin, or equivalent ARM/GPU SoCs).
  • Requires 5 days/week in‑office collaboration with the teams.
Bonus Qualifications
  • Familiarity with embodied AI models — VLA, multimodal transformers, or diffusion‑based policies — and their inference characteristics.
  • Familiarity with compiler‑based optimization pipelines such as XLA, torch.compile, or MLIR for graph‑level model acceleration.
  • Understanding of robotics system constraints such as control‑loop timing, sensor‑fusion latency, and memory bandwidth limits on edge SoCs.
  • Publication or production work in efficient deep learning or on‑device ML systems.
Benefits
  • Competitive stock options/equity programs.
  • Health, dental, and vision insurance, 401(k) plan.
  • Visa sponsorship and green card support for qualified candidates.
  • Lunches and dinners, a fully stocked kitchen, and regular team‑building events.
Position Requirements
10+ Years work experience