Principal Engineer - Distributed AI Systems Architecture (Heterogeneous Compute), Hillsboro area, Oregon, USA, IT/Tech

Position: Principal Engineer - Distributed AI Systems Architecture (Heterogeneous Compute)
Job Details:

Job Description:

We are seeking a Principal Engineer to define and architect the next generation of distributed AI systems across heterogeneous compute platforms, including CPUs, GPUs, IPUs/FNICs, and emerging dataflow accelerators.

This role focuses on one of the hardest problems in modern computing:

How to dynamically execute and optimize large-scale AI computation graphs across diverse hardware while managing state, locality, and performance at system scale.

You will operate at the intersection of systems architecture, high-performance computing, and AI infrastructure, defining the execution model, runtime abstractions, and placement strategies that turn a rack of heterogeneous devices into a coherent, programmable system.

Key Responsibilities

1. Dynamic Execution of Distributed Computation Graphs

* Define a runtime model for executing AI workloads as distributed computation graphs across heterogeneous resources

* Design abstractions for graph representation, dependencies, and execution semantics

* Enable dynamic scheduling and execution across CPUs, GPUs, IPUs/FNICs, and specialized accelerators

2. Stateful Scheduling and Memory-Centric Architecture

* Architect systems where state (e.g., KV cache) is a first-class concern in scheduling and execution

* Define models for data locality, memory hierarchy, and state ownership in distributed inference solutions

* Optimize for minimal data movement and efficient access to distributed state

3. Graph Introspection and Automated Partitioning

* Develop mechanisms to analyze AI computation graphs and classify stages by:

o compute intensity

o memory bandwidth requirements

o communication cost

o latency sensitivity

* Drive automated or semi-automated partitioning of workloads across heterogeneous compute

4. Integration of Specialized Accelerators

* Architect frameworks that treat specialized accelerators (e.g., dataflow engines) as first-class execution targets

* Define execution boundaries, data exchange models, and integration strategies across device classes

* Enable interoperability across diverse compute paradigms without sacrificing performance

5. MoE-Aware Execution and Adaptive Placement

* Design runtime strategies for Mixture-of-Experts (MoE) models, including:

o expert placement

o routing locality

o load balancing vs data movement trade-offs

* Enhance existing frameworks for MoE, optimizing the communication path with IPUs/FNICs and the compute path with Intel accelerators

* Enable adaptive execution based on real-time system signals (latency, utilization, skew)

6. Adaptive Runtime and Feedback-Driven Optimization

* Define observability and telemetry models for distributed AI execution

* Build feedback loops that continuously optimize placement, scheduling, and resource utilization

* Drive system-level performance across latency, throughput, and efficiency metrics

Qualifications:

Minimum Qualifications:

* Bachelor's degree in Computer Science, Software Engineering, or a related specialized field, or equivalent experience per business needs

* 12-plus years of experience with a Bachelor's degree

* Proven expertise in defining and implementing software architectures for AI frameworks, protocols, and algorithms.

* Deep experience in systems architecture, high-performance computing, or distributed systems

* Strong background in parallel or data-parallel computation models

* Experience with heterogeneous compute environments (CPU, GPU, DSP, or accelerators)

* Proven ability to design end-to-end systems from abstraction through implementation

* Strong understanding of performance trade-offs across compute, memory, and interconnect

Preferred Qualifications:

* 8-plus years of experience with a Master's degree, or 6-plus years of experience with a PhD

* Experience with AI/ML systems, inference infrastructure, or large-scale model serving

* Familiarity with stream processing, dataflow models, or graph execution systems

* Knowledge of modern AI frameworks or runtimes

* Experience building developer-facing SDKs or programming models

* Background in performance optimization and benchmarking

Requirements listed would be obtained through a combination of…