High Performance Compute (HPC) Software Engineer – HPC SW Systems

Job in Ann Arbor, Washtenaw County, Michigan, 48113, USA
Listing for: KLA-Belgium
Full Time position
Listed on 2026-05-18
Job specializations:
  • IT/Tech
    Systems Engineer, Hardware Engineer
Salary/Wage Range or Industry Benchmark: 105,900 - 180,000 USD yearly
Job Description & How to Apply Below
Position: High Performance Compute (HPC) Software Engineer – HPC SW Systems
Company Overview

KLA is a global leader in diversified electronics for the semiconductor manufacturing ecosystem. Virtually every electronic device in the world is produced using our technologies. No laptop, smartphone, wearable device, voice-controlled gadget, flexible screen, VR device or smart car would have made it into your hands without us. KLA invents systems and solutions for the manufacturing of wafers and reticles, integrated circuits, packaging, printed circuit boards and flat panel displays.

The innovative ideas and devices that are advancing humanity all begin with inspiration, research and development. KLA focuses more than average on innovation and we invest 15% of sales back into R&D. Our expert teams of physicists, engineers, data scientists and problem-solvers work together with the world’s leading technology providers to accelerate the delivery of tomorrow’s electronic devices. Life here is exciting and our teams thrive on tackling really hard problems.

There is never a dull moment with us.

Job Description / Preferred Qualifications

Key Responsibilities

HPC Software Engineering

Design, develop, and optimize HPC software running on large-scale Linux clusters, including distributed and parallel workloads (MPI, multithreading, GPU-accelerated pipelines, containerized workloads).

Optimize application performance and power utilization across CPU, memory, storage, and network subsystems, with attention to throughput, latency, and scaling behavior.

Develop and maintain system-level tooling for cluster bring-up, diagnostics, monitoring (including component power usage), and health checks.

Work closely with algorithms, systems and application teams to understand and translate workload characteristics into power-efficient HPC software solutions.

HPC Systems & Hardware Awareness

Collaborate with hardware and systems teams to define HPC node, storage, and interconnect requirements based on software and algorithm needs.

Understand and influence CPU/GPU selection, memory sizing, PCIe layout, NUMA behavior, and network topology to ensure optimal software performance.

Participate in HW/SW co-debug activities, including performance bottlenecks, stability issues, and failure analysis.

Rack & Infrastructure Engineering

Understand rack-level integration of HPC systems, focusing on power, cooling, cabling, networking, and physical layout considerations.

Understand data-center and lab constraints such as power budgets, thermal limits, network drops, and serviceability.

Contribute to best practices and design reviews for new platforms and refresh cycles.

Cross-Functional Collaboration

Act as a technical bridge between software, hardware, and systems teams.

Provide clear technical documentation covering software and system architecture, deployment flows, and performance assumptions.

Required Qualifications

Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Electrical Engineering, or equivalent practical experience.

Strong experience developing HPC or systems software on Linux.

Proficiency in Java, C++, and/or other system-level or performance-oriented languages.

Hands-on experience with parallel computing (MPI, OpenMP, multithreading). Candidates with GPU computing experience (CUDA, ROCm, or equivalent) are preferred.

Solid understanding of HPC hardware fundamentals: CPUs, memory hierarchies, storage, networking (Ethernet / InfiniBand).

Practical experience working with clusters, servers, or rack-scale systems in lab or production environments.

Strong debugging skills across software, OS, and hardware boundaries.

Preferred Qualifications

Experience with containerized HPC environments (Docker, Singularity/Apptainer, Kubernetes in HPC contexts).

Familiarity with high-speed interconnects, storage architectures, and performance benchmarking.

Exposure to rack integration, including cabling, power distribution, cooling, and system bring-up.

Experience in semiconductor, manufacturing, or high-reliability systems environments.

Ability to reason about system reliability, MTBF/MTBA, and failure modes in large compute installations.

What Makes This Role Unique at KLA

Work on mission-critical HPC platforms that directly…