Accelerator Runtime Lead

Job in Santa Clara, Santa Clara County, California, 95053, USA
Listing for: Velaura
Full Time position
Listed on 2026-06-18
Job specializations:
  • Software Development
    Software Engineer, Software Architect, C++ Developer, DevOps
Salary/Wage Range or Industry Benchmark: 200,000 USD per year
Job Description & How to Apply Below

About Velaura

Velaura is building the next generation of compute platforms for Physical AI.

As AI moves beyond the datacenter into robots, autonomous mobile systems, drones, and other embodied systems, traditional compute architectures are increasingly constrained by power, memory bandwidth, latency, real-time requirements, and functional safety considerations.

Our mission is to develop the foundational compute technologies that enable intelligent systems to operate efficiently in the physical world.

We are assembling a team of exceptional architects and engineers to rethink how AI, sensing, memory, and control interact within a modern computing platform.

Role Overview

We are looking for an Accelerator Runtime Lead to own the execution runtime for Velaura’s AI accelerator.

This role will lead the user-space software layer that loads compiled model artifacts, manages execution state, coordinates memory and tensor lifetimes, invokes the accelerator driver, handles synchronization, exposes APIs, and provides profiling and telemetry to customers. The ideal candidate understands the boundary between compiler, runtime, driver, firmware, and application frameworks.

Responsibilities
  • Lead architecture and development of the AI accelerator runtime.
  • Own runtime APIs for loading, initializing, executing, profiling, and managing compiled model artifacts.
  • Define runtime memory management, tensor lifetime, buffer sharing, synchronization, batching, streaming, and multi-context execution behavior.
  • Work with the Accelerator Driver/Firmware Interface Lead on command submission, device queues, interrupts, reset/recovery, telemetry, and scheduling semantics.
  • Partner with the compiler team on artifact format, metadata, memory planning, execution plans, fallback behavior, version compatibility, and profiling hooks.
  • Support integration with C/C++, Python, ROS, and higher-level SDK APIs.
  • Define runtime error handling, logging, versioning, compatibility, and diagnosability for customer deployments.
  • Build benchmarking and profiling support to expose latency, throughput, memory usage, device utilization, and bottlenecks.
  • Establish runtime validation strategy, including correctness, stress, concurrency, recovery, compatibility, and performance tests.
  • Hire, mentor, and lead runtime engineers working across user-space libraries, framework integration, and SDK enablement.

Required Qualifications
  • Deep experience building runtime systems, accelerator APIs, embedded middleware, GPU/NPU runtimes, inference runtimes, or high-performance systems software.
  • Strong C/C++ programming skills and production software engineering discipline.
  • Strong understanding of device memory, DMA, synchronization, command submission, kernel/user-space boundaries, and performance-sensitive APIs.
  • Experience integrating with drivers, firmware, compilers, or low-level hardware interfaces.
  • Experience designing stable APIs, error handling, versioning, logging, and customer-facing diagnostics.
  • Ability to work cross-functionally with compiler, kernel, firmware, SDK, performance, and SQA teams.
  • Experience leading technical teams or major system architecture areas.

Preferred Qualifications
  • Experience with AI inference runtimes such as TensorRT, OpenVINO, ONNX Runtime execution providers, TFLite delegates, QNN/SNPE, TVM runtimes, or similar stacks.
  • Experience with heterogeneous compute, NPU/GPU/DSP accelerators, robotics workloads, or edge AI platforms.
  • Experience with multi-stream, low-latency, real-time, or high-throughput inference systems.
  • Familiarity with quantized model execution, tensor layout constraints, model artifacts, and graph execution.
  • Experience with profiling, tracing, telemetry, and runtime observability.
  • Familiarity with Linux device drivers, dma-buf, containers, ROS 2, or embedded Linux SDKs.

Compensation & Benefits

At Velaura, we believe exceptional talent deserves exceptional rewards. Compensation for this role includes a competitive base salary, performance-based incentives, and equity participation, allowing team members to share in the company’s long‑term success. The base pay range for this role is between $200k and $500k, and your base pay will…
