AI Performance Engineer
Job in
San Francisco, San Francisco County, California, 94199, USA
Listed on 2026-05-29
Listing for:
Graphcore
Full Time position
Job specializations:
- Software Development
- AI Engineer
Job Description & How to Apply Below
Requirements
- BS/MS in Computer Science, Electrical Engineering, or a related field
- Experience with distributed systems and communication libraries (MPI, NCCL, UCX, libfabric)
- Strong programming skills in C++ and Python
- Experience profiling and optimizing HPC or AI/ML workloads
- Familiarity with ML benchmarks such as MLPerf
- (Desirable) Experience with GPUs or accelerated computing architectures
- (Desirable) Knowledge of HPC networking and interconnect technologies (InfiniBand, RoCE)
- (Desirable) Familiarity with ML frameworks such as PyTorch or TensorFlow
- (Desirable) Understanding of ARM architectures and toolchains
- (Desirable) Strong debugging, profiling, and performance optimization skills
About the Role
Graphcore’s AI/ML training and inference infrastructure is rapidly scaling to meet the growing demands of AI workloads across mobile, edge, and datacenter environments. This role focuses on optimizing performance across ARM-based architectures and large-scale distributed systems, ensuring efficiency, scalability, and reliability across the full hardware-software stack.
The System Engineering Performance team architects and optimizes high-performance infrastructure for large-scale datacenter deployments. The team works across hardware, software, networking, and system architecture to deliver cutting-edge AI solutions and ensure optimal system performance at scale.
Responsibilities
- Analyze ML models’ compute and memory requirements using roofline analysis and simulations
- Collaborate across hardware and software teams to optimize large-scale AI workloads
- Benchmark, monitor, and troubleshoot system performance across distributed systems
- Optimize communication stacks including MPI, NCCL, UCX, RDMA, and networking fabrics
- Profile and optimize AI workloads, focusing on performance bottlenecks
- Develop high-quality, ARM-compatible code and documentation