Director, Product Management - AI Inference Platform (San Jose area, California, USA; IT/Tech)

The future isn't just about technology - it's about people and the limitless possibilities AI can unlock for us all. As AI transforms how we live, work, and connect, Arm is at its very foundation, powering the innovations that are shaping our world.

Arm's Cloud AI team is building next-generation platforms to power the AI workloads of tomorrow, and we're growing our product management organization to help guide this journey. This is an opportunity to help shape the product direction of a major strategic investment while also helping build a new team and culture.

About the Role

We are seeking a Director / Principal Product Manager to lead the strategy and execution of a next-generation AI Inference Platform. This role sits at the intersection of hardware and software, defining how modern AI models are executed, served, and optimized at scale across distributed compute environments.

You will own the core platform stack, driving innovation in performance, efficiency, and scalability while partnering closely with engineering, research, and infrastructure teams to deliver world-class AI systems.

Responsibilities:

* Define and drive the product vision, strategy, and roadmap for large-scale AI inference systems

* Own product direction across compute execution, inference serving, and control plane systems

* Drive inference architecture and orchestration strategy, including distributed serving, routing, batching, and scheduling

* Define capabilities for KV cache management, memory optimization, and token lifecycle efficiency (prefill vs decode)

* Partner with engineering to enable hardware-software co-design, improving performance across accelerators and interconnects

* Shape the development of inference platform capabilities that deliver measurable gains in latency, throughput, and cost efficiency

* Own and evolve control plane services (APIs, policy engines, context/state management, usage/accounting)

* Translate complex technical systems into clear product value and customer impact

* Align multi-functional collaborators across infrastructure, research, and product organizations

* Define and track key performance metrics (P99 latency, TTFT, throughput, cost per inference/token, availability)

* Influence platform direction across constantly evolving AI workloads (LLMs, multimodal, agentic systems)

Required Skills and Experience:

* Proven experience in product management within AI/ML infrastructure, cloud platforms, or distributed systems

* Deep understanding of AI inference systems and large-scale serving architectures

* Solid understanding of LLM inference concepts (prefill vs decode, KV cache, token streaming)

* Demonstrated ability to deliver scalable, high-performance platform products

* Strong technical expertise and ability to collaborate directly with engineering teams

"Nice To Have" Skills and Experience:

* Experience with accelerator-based systems and performance optimization

* Familiarity with modern inference frameworks (e.g., TensorRT-LLM, vLLM)

* Experience working on production-scale AI systems

Why Join Arm?

You will play a critical role in shaping the future of AI infrastructure, defining how next-generation models are deployed, optimized, and scaled efficiently. This is an opportunity to work at the forefront of AI systems, solving complex challenges at the intersection of performance, scalability, and usability.

Salary Range:

$260,000-$351,800 per year

We value people as individuals and our dedication is to reward people competitively and equitably for the work they do and the skills and experience they bring to Arm. Salary is only one component of Arm's offering. The total reward package will be shared with candidates during the recruitment and selection process.

Accommodations at Arm

At Arm, we want to build extraordinary teams. If you need an adjustment or an accommodation during the recruitment process, please email us. Note that by sending us the requested information, you consent to its use by Arm to arrange for appropriate accommodations. All accommodation or adjustment requests will be treated with confidentiality, and information concerning these requests will only be disclosed as necessary to provide the accommodation.

Although this is not an exhaustive list, examples of support include breaks between interviews, having documents read aloud, or office accessibility. Please email us about anything we can do to accommodate you during the recruitment process.

Hybrid Working at Arm

Arm's approach to hybrid working is designed to create a working environment that supports both high performance and personal wellbeing. We believe in bringing people together face to face to enable us to work at pace, whilst recognizing the value of flexibility. Within that framework, we empower groups/teams to determine their own hybrid working patterns, depending on the work and the team's needs.

Details of what this means for each role will be shared upon application. In some cases, the flexibility we can offer is limited by local legal,…
