Senior Firmware Engineer, Edge AI / NPU Runtime

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: Tacit
Full Time position
Listed on 2026-06-17
Job specializations:
  • Software Development
    Embedded Systems/ Firmware/ IoT, AI Engineer (Applied/Software), Machine Learning/ ML Engineer
Salary/Wage Range or Industry Benchmark: 125000 - 150000 USD Yearly
Position: Senior Firmware Engineer, Edge AI / NPU Runtime

About Tacit

We are an early-stage deep tech startup based in San Francisco, developing innovative hardware that rethinks human-computer interaction. We are backed by General Catalyst, Khosla Ventures, and Greylock Partners, with a founding team from Stanford, BrainGate, Oculus, and Tesla. While we can’t reveal too much just yet, our team is tackling cutting-edge engineering challenges to bring revolutionary products to life.

About the role

We’re looking for a Senior Firmware Engineer, Edge AI / NPU Runtime to help architect, optimize, and ship next-generation neurotech hardware with production-grade on-device intelligence. You will own critical parts of the embedded AI stack, from realtime sensor acquisition through preprocessing, NPU/DSP-accelerated inference, postprocessing, telemetry, and product deployment.

This is a hands‑on role for someone who wants to work close to the hardware while shaping the intelligence users experience in the product. You’ll help define how models run on-device, how sensor data moves through the system, and how we meet tight latency, reliability, and power budgets in real‑world use.

What you'll do
  • Edge AI & NPU Inference

    • Own deployment of ML models onto embedded targets using NPUs, DSPs, MCUs, or other hardware accelerators.

    • Integrate embedded inference runtimes, vendor NPU/DSP SDKs, and model deployment workflows into production firmware.

    • Optimize inference latency, memory footprint, throughput, power consumption, and accelerator utilization on production hardware.

    • Partner with ML teams on quantization, operator support, model architecture tradeoffs, calibration datasets, and accuracy/performance regressions.

  • Realtime Sensor-to-Inference Systems

    • Build realtime sensor-to-inference pipelines, including acquisition, timestamping, synchronization, preprocessing, feature extraction, model execution, and postprocessing.

    • Design low‑latency data movement using DMA, interrupts, ring buffers, deterministic scheduling, and efficient memory layouts.

    • Support streaming inference patterns such as sliding windows, temporal models, event‑driven execution, and continuous sensor processing.

    • Maintain inference quality and timing guarantees under real‑world conditions such as sensor noise, clock drift, dropped samples, variable system load, and power‑state transitions.

  • Power-Optimized Embedded Firmware

    • Optimize end‑to‑end energy per inference across sensing, preprocessing, model execution, postprocessing, and idle time.

    • Use low‑power firmware techniques such as sleep states, duty cycling, subsystem power gating, clock scaling, batching/windowing, and dynamic power management.

    • Profile and improve power consumption across sensors, CPU, NPU/DSP, memory, and supporting firmware infrastructure.

  • Product Quality & Debugging

    • Bring up and debug firmware across sensors, accelerators, power systems, embedded compute, and production hardware.

    • Use lab tools, traces, logs, telemetry, and instrumentation to root‑cause complex embedded system issues.

    • Translate product and customer experience goals into concrete latency, reliability, responsiveness, and power targets.

    • Build diagnostics, validation hooks, and performance benchmarks to ensure reliable real‑world edge inference behavior.

Requirements
  • 5+ years of experience in embedded firmware, embedded systems, or edge ML systems.

  • Strong C/C++/Rust experience on resource‑constrained embedded platforms.

  • Experience with RTOS‑based systems such as FreeRTOS, Zephyr, ThreadX, or similar.

  • Experience deploying or optimizing ML inference on embedded targets, NPUs, DSPs, MCUs, or edge SoCs.

  • Strong understanding of realtime embedded systems, including DMA, interrupts, concurrency, memory management, and low‑latency data movement.

  • Experience optimizing embedded systems for latency, memory footprint, throughput, and power consumption.

  • Hands‑on debugging and bring‑up experience across embedded hardware and firmware systems, with strong cross‑functional communication across firmware, ML, electrical, software, and product teams.

Strong candidates may have
  • Experience with embedded inference runtimes, deployment toolchains, or edge AI SoCs/accelerators such as TensorFlow Lite Micro, ONNX…

Position Requirements
10+ Years work experience