
Senior ML Performance Engineer

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: Amadeus Search
Full Time position
Listed on 2026-02-17
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning/ ML Engineer, Systems Engineer, Data Engineer
Salary/Wage Range or Industry Benchmark: 60000 - 80000 USD Yearly
Job Description & How to Apply Below

Position: Senior ML Performance Engineer
Location: SF Bay Area (US) or Toronto (Canada) – Hybrid
Employment Type: Full-Time
Industry: AI Infrastructure / Compiler Systems

Overview

A venture-backed AI infrastructure company is building a high-performance, portable compiler designed to let developers “build once, deploy anywhere.” This includes cloud, edge, and hybrid environments — all optimized for resource efficiency, scalability, and sustainable AI development.

The team is looking for a Senior ML Performance Engineer to architect and lead a Performance Testing Platform from the ground up, measuring and optimizing the performance of large language models (LLMs) before and after compiler optimization on modern GPU architectures.

This role sits at the intersection of ML systems, GPU architecture, and performance engineering, with high visibility into product quality and customer impact.

Key Responsibilities
  • Design and implement a comprehensive performance testing platform for LLM inference workloads across GPU clusters

  • Define benchmarking methodologies, metrics, and test suites (latency, throughput, memory utilization, power consumption, and model accuracy)

  • Establish baseline performance for unoptimized models and validate post-optimization improvements

  • Build automated pipelines for continuous performance validation across compiler releases and model updates

  • Investigate performance bottlenecks using GPU profilers and system-level monitoring

  • Collaborate with compiler engineers, ML engineers, and DevOps to integrate performance testing into development workflows

  • Create dashboards and reporting to track performance trends, regressions, and wins

  • Document best practices for GPU-based ML performance testing

Required Qualifications
  • 7+ years in performance engineering, benchmarking, or systems engineering roles

  • Strong knowledge of ML inference workloads, particularly transformer-based LLMs

  • Hands-on GPU programming and optimization experience (CUDA, ROCm, or similar)

  • Strong programming skills in Python and C/C++

  • Proven experience building performance testing infrastructure or benchmarking platforms from scratch

  • Experience with ML frameworks:
    PyTorch, TensorFlow, ONNX Runtime, vLLM, TensorRT-LLM

  • Proficiency with profiling and debugging GPU workloads

  • Experience with CI/CD systems and test automation frameworks

  • Strong analytical skills with the ability to design experiments, analyze results, and communicate findings clearly

Nice to Have
  • AMD GPU experience (MI200/MI300) and ROCm ecosystem

  • Compiler optimization knowledge

  • Distributed inference and multi-GPU workloads

  • ML model quantization, pruning, and optimization techniques

  • High-performance computing or systems-level optimization

  • Infrastructure-as-code experience:
    Kubernetes, Docker, Terraform

  • Contributions to open-source ML or systems projects

Personal Attributes
  • Detail-oriented — able to spot subtle regressions

  • Self-driven and accountable

  • Collaborative and team-oriented

  • Passionate about sustainable AI

  • Clear and effective communicator

Compensation & Benefits
  • Competitive salary, dependent on experience and location

  • Equity and bonus opportunities

  • Medical, dental, and vision coverage

  • Retirement savings plan

  • Additional wellness benefits

Why This Role Is Unique
  • Build the infrastructure that validates high-performance ML models

  • Influence core product quality and customer outcomes

  • Work in a highly technical, high-impact environment at the forefront of AI systems

  • Collaborate across a globally distributed team

Position Requirements
10+ years of work experience