AI Performance Architect

Job in San Diego, San Diego County, California, 92189, USA
Listing for: Qualcomm
Full Time position
Listed on 2025-12-13
Job specializations:
  • Engineering
    Systems Engineer, AI Engineer, Hardware Engineer, Computer Science
  • IT/Tech
    Systems Engineer, AI Engineer, Hardware Engineer, Computer Science
Job Description & How to Apply Below
Position: Staff AI Performance Architect

Apply for the Staff AI Performance Architect role at Qualcomm.

Company

Qualcomm Technologies, Inc.

Job Area

Engineering Group / Machine Learning Engineering

General Summary

Today, more intelligence is moving to end devices, and mobile is becoming a pervasive AI platform. At the same time, data centers are expanding AI capability through widespread deployment of ML accelerators. Qualcomm envisions making AI ubiquitous—expanding beyond mobile and powering other end devices, data centers, vehicles, and things. We are inventing, developing, and commercializing power‑efficient on‑device AI, edge cloud AI, data center and 5G to make this a reality.

We are looking for AI Accelerator Architecture Engineers to drive functional, performance and power enhancements into the hardware to enable state‑of‑the‑art training capabilities. AI inference and training systems must scale to a large number of accelerators, servers and racks. Our devices must be designed to scale to handle the largest of today’s models.

The AI Architecture team comprises experts who span the full gamut, from software architecture, algorithm development, and kernel optimization down to hardware accelerator block architecture and SoC design. The ideal candidate will augment the team by contributing to one or more of these areas.

Minimum Qualifications
  • Bachelor’s degree in Computer Science, Engineering, Information Systems, or related field AND 4+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
  • Master’s degree in Computer Science, Engineering, Information Systems, or related field AND 3+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
  • PhD in Computer Science, Engineering, Information Systems, or related field AND 2+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
Responsibilities
  • Understand trends in ML network design through customer engagements and the latest academic research, and determine how they will affect both SW and HW design.
  • Work with customers to determine hardware requirements for AI training systems.
  • Analyze current accelerator and GPU architectures.
  • Architect enhancements required for efficient training of AI models.
  • Design and architecture of:
    • Flexible Computational Blocks involving a variety of datatypes (floating point, fixed point, micro‑scaling).
    • Flexible Computational Blocks involving a variety of precisions (32/16/8/4/2/1‑bit).
    • Capability to optimally perform dense and sparse GEMM, GEMV.
  • Design memory technology and subsystems optimized for a range of requirements:
    • Capacity
    • Bandwidth
    • Compute in Memory, Compute near memory.
  • Scale‑Out and Scale‑Up Architectures:
    • Switches, NoCs, co‑design with communication collectives.
    • Optimized for power.
  • Ability to perform competitive analysis.
  • Co‑design HW with SW/GenAI (LLM) requirements.
  • Define performance models to prove the effectiveness of architecture proposals.
  • Pre‑silicon prediction of performance for various ML training workloads.
  • Perform analysis of performance/area/power trade‑offs for future HW and SW ML algorithms, including the impact of SoC components (memory and bus).
Requirements
  • Master’s degree in Computer Science, Engineering, Information Systems, or related field.
  • 3+ years of Hardware Engineering experience defining architecture of GPUs or accelerators used for training of AI models.
  • In‑depth knowledge of NVIDIA/AMD GPU capabilities and architectures.
  • Knowledge of LLM architectures and their HW requirements.
Preferred Skills And Experience
  • Knowledge of computer architecture, digital circuits, and hardware simulators.
  • Knowledge of communication protocols used in AI systems.
  • Knowledge of Network‑on‑Chip (NoC) designs used in System‑on‑Chip (SoC) designs.
  • Understanding of various memory technologies used in AI systems.
  • Experience in modeling hardware and workloads to extract performance and power estimates.
  • High‑level hardware modeling experience preferred.
  • Knowledge of AI training systems such as NVIDIA DGX and NVL72.
  • Experience training and fine‑tuning LLMs using distributed training frameworks such as DeepSpeed and FSDP.
  • K…