
Principal Engineer – Distributed AI Systems Architecture (Heterogeneous Compute)

Job in Austin, Travis County, Texas, 78716, USA
Listing for: Intel Corporation
Full Time position
Listed on 2026-06-04
Job specializations:
  • IT/Tech
    Systems Engineer, AI Engineer
Salary/Wage Range or Industry Benchmark: 255850 - 361200 USD Yearly
Job Description & How to Apply Below
Position: Principal Engineer – Distributed AI Systems Architecture (Heterogeneous Compute)
Locations: Santa Clara, California, US; Austin, Texas, US; Hillsboro, Oregon, US
Time type: Full time
Posted on: Posted Today
Job requisition: JR0283339

Job Details:

Job Description:

We are seeking a Principal Engineer to define and architect the next generation of distributed AI systems across heterogeneous compute platforms, including CPUs, GPUs, IPUs/FNICs, and emerging dataflow accelerators. This role focuses on one of the hardest problems in modern computing: how to dynamically execute and optimize large-scale AI computation graphs across diverse hardware while managing state, locality, and performance at system scale.

You will operate at the intersection of systems architecture, high-performance computing, and AI infrastructure, defining the execution model, runtime abstractions, and placement strategies that turn a rack of heterogeneous devices into a coherent, programmable system.

Key Responsibilities
1. Dynamic Execution of Distributed Computation Graphs
• Define a runtime model for executing AI workloads as distributed computation graphs across heterogeneous resources
• Design abstractions for graph representation, dependencies, and execution semantics
• Enable dynamic scheduling and execution across CPUs, GPUs, IPUs/FNICs, and specialized accelerators
2. Stateful Scheduling and Memory-Centric Architecture  
• Architect systems where state (e.g., KV cache) is a first-class concern in scheduling and execution  
• Distributed inferencing solution: define models for data locality, memory hierarchy, and state ownership
• Optimize for minimal data movement and efficient access to distributed state    
3. Graph Introspection and Automated Partitioning  
• Develop mechanisms to analyze AI computation graphs and classify stages by:
    o compute intensity
    o memory bandwidth requirements
    o communication cost
    o latency sensitivity
• Drive automated or semi-automated partitioning of workloads across heterogeneous compute    
4. Integration of Specialized Accelerators  
• Architect frameworks that treat specialized accelerators (e.g., dataflow engines) as first-class execution targets  
• Define execution boundaries, data exchange models, and integration strategies across device classes  
• Enable interoperability across diverse compute paradigms without sacrificing performance
5. MoE-Aware Execution and Adaptive Placement
• Design runtime strategies for Mixture-of-Experts (MoE) models, including:
    o expert placement
    o routing locality
    o load balancing vs. data movement trade-offs
• Enhance existing frameworks for MoE and optimize the communication path with IPUs/FNICs and the compute path with Intel accelerators
• Enable adaptive execution based on real-time system signals (latency, utilization, skew)
6. Adaptive Runtime and Feedback-Driven Optimization
• Define observability and telemetry models for distributed AI execution  
• Build feedback loops that continuously optimize placement, scheduling, and resource utilization  
• Drive system-level performance across latency, throughput, and efficiency metrics

Qualifications:

Minimum Qualifications:
• Bachelor's (BS) degree in Computer Science, Software Engineering, or a related specialized field, or equivalent experience per business needs
• 12-plus years of experience with a Bachelor's degree  
• Proven expertise in defining and implementing software architectures for AI frameworks, protocols, and algorithms.  
• Deep experience in systems architecture, high-performance computing, or distributed systems  
• Strong background in parallel or data-parallel computation models  
• Experience with heterogeneous compute environments (CPU, GPU, DSP, or accelerators)  
• Proven ability to design end-to-end systems from abstraction through implementation  
• Strong understanding of performance trade-offs across compute, memory, and interconnect    

Preferred Qualifications:
• 8-plus years of experience with a…