Senior Product Manager, Software & Developer Platform
Burlingame area, California, USA. Software Development.

Quadric has created an innovative general purpose neural processing unit (GPNPU) architecture. Quadric's co‑optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery‑operated smart‑sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

Quadric is seeking a Senior Principal Product Manager to own the software roadmap for the Chimera Graph Compiler (CGC), the developer‑facing platform customers live in from pre‑silicon through production. This role drives the monthly SDK release train, sets pattern coverage and quantization strategy, and works directly with anchor customers to convert model gaps into engineering roadmap entries. You’ll partner with the CPO on strategy and with SW engineering on execution.

This role is based in Burlingame (on‑site), with quarterly travel to Japan, the U.S. East Coast, and customer SW teams worldwide.

Responsibilities

Software release train. Own the monthly SDK release and the quarterly major release: contents, release sync, go/no‑go, release notes, and customer communications.
Pattern coverage roadmap. Decide which graph patterns CGC compiles next — attention variants, quantization schemes, normalization patterns — and sequence them against customer model requirements each quarter.
Market‑driven demo strategy. Lead with the market story and push it through every layer: demo, model zoo, pattern coverage, compiler work. Own what we publish and when.
Customer engagement. Present in technical reviews with anchor customers. Convert model gap lists into engineering‑ready roadmap entries.
Quantization and numerics. Own the roadmap for INT4 (W4A8/W4A16), FP8, OCP MX, and KV cache compression. Coordinate with HW PM on MAC capability and with customer SW teams on model format decisions ahead of silicon tape‑out.
Framework and runtime integrations. Define the integration strategy for GGML/llama.cpp, vLLM, ONNX Runtime, ExecuTorch, and HF Optimum, including where to pursue a deep partnership vs. a thin reference integration.
Model zoo. Maintain a set of customer‑confidence models (LLM chat, BEVFormer, VLA, ADAS perception) that serve as forcing functions for compiler completeness.
Quarterly roadmap tours. Take the roadmap to anchor customers, prospects, and the field. Brief the PMM monthly on what shipped and how to position it.
Competitive intelligence. Track Synopsys MetaWare, Arm KleidiAI, Ceva NeuPro Studio, and NVIDIA TensorRT‑LLM. Brief exec and sales quarterly.
Safety and quality. Coordinate with the safety lead on ISO 26262 traceability and qualification artifacts.

Requirements

Domain — non‑negotiable. Shipped product on at least one of: NPU or AI accelerator IP/silicon stack; graph or ML compiler (TVM, MLIR, XLA, or proprietary); developer‑facing AI inference runtime or agent framework. "Adjacent" does not count.
Modern AI workload fluency — non‑negotiable. Ready conversation, no prep, on: agentic workflows and LLM serving, KV cache optimization, quantization schemes (AWQ, GPTQ, SmoothQuant, QAT vs. PTQ), datatypes (INT4, FP8, BF16, OCP MX), and inference platforms (vLLM, llama.cpp, TensorRT‑LLM, ExecuTorch, ORT).
Shipping bar — non‑negotiable. You shipped a developer‑facing AI or compute product — SDK, runtime, compiler, or inference service — with real users and a release cadence you owned.
Agent‑pilled — non‑negotiable. You use agentic AI tools daily (Claude Code, Cursor, or equivalent) to produce work. Having read about agentic AI without integrating it is not sufficient.
Customer first — non‑negotiable. When engineering wants to build the elegant thing and the customer needs the workable thing, you take the workable thing every time.
5+ years in PM, with 3+ years on a developer‑facing AI/ML or compute platform.
Owned a release cadence — picked what ships, what slips, and defended the call.
Experience with in‑person technical customer reviews.
Bay Area resident or willing to relocate to Burlingame.

Preferred

ML background: graduate degree, published work, trained model, or OSS contribution.
Automotive Tier 1 engagement; ISO 26262 awareness.
Prior product work on a competing NPU/GPU/AI accelerator stack (MetaWare, Arm ML, Ceva, TensorRT, Hailo, Tenstorrent, etc.).
OSS contributions to vLLM, llama.cpp, TVM, MLIR, ONNX Runtime, or ExecuTorch.

Benefits

Competitive salary and meaningful equity.
Medical, dental, and vision plan options starting on day one.
401(k) retirement plan.
Flexible paid time off (unlimited, non‑accrual).
Company‑provided lunches and a stocked kitchen.
Monthly parking or Caltrain pass.
Downtown Burlingame office, walking distance from Caltrain.
