HPC Support Engineer
Job in Charlottesville, Albemarle County, Virginia, 22905, USA
Listed on 2026-06-02
Listing for: SAIC
Full Time position
Job specializations:
- IT/Tech
Systems Engineer, IT Support, Cloud Computing
Job Description & How to Apply Below
SAIC is looking for a highly qualified **HPC Support Engineer** to support the Army's Golden Dome initiative. The engineer will support users executing workloads within Linux-based High Performance Computing (HPC) cluster environments used for distributed compute workloads, simulation environments, and GPU-enabled processing.
The environment will include:
+ multi-node Linux compute clusters
+ workload scheduling platforms such as Slurm or PBS
+ distributed parallel compute workloads utilizing MPI or OpenMP
+ GPU-enabled compute resources supporting CUDA-based processing
+ high-performance networking technologies including RDMA / InfiniBand
The system will be used to support scientific computing, simulation workloads, and other distributed compute operations within a secure research environment.
Candidates should be comfortable working within cluster-scale computing environments where performance, scheduler configuration, and distributed workload execution are critical operational factors.
The HPC Support Engineer will assist users executing computational workloads within HPC cluster environments.
The role focuses on:
+ supporting distributed compute workloads
+ troubleshooting job execution issues
+ assisting users with scheduler job submission scripts
+ identifying workload performance bottlenecks
+ supporting GPU-enabled workloads
+ promoting efficient cluster utilization and HPC best practices
Candidates should have experience working with distributed compute workloads and Linux-based HPC environments.
**Core Technical Capabilities**
Candidates should demonstrate capability in most of the following areas.
**HPC Workload Execution**
Experience supporting execution of distributed workloads on HPC cluster platforms.
Candidates should understand how compute workloads interact with cluster schedulers, compute nodes, and distributed resources.
**Workload Scheduling Platforms**
Experience executing and troubleshooting workloads using schedulers such as:
+ Slurm
+ PBS / PBS Pro
+ Torque
+ Grid Engine
Candidates should understand job submission workflows and resource allocation concepts such as CPU, memory, and GPU scheduling.
Candidates should be comfortable reading and troubleshooting scheduler job submission scripts used to execute distributed workloads.
**Linux Systems Usage**
Strong Linux experience including:
+ command-line system usage
+ execution of compute workloads within Linux environments
+ troubleshooting application execution issues
Experience with RHEL-based environments is preferred.
**Distributed Compute Workloads**
Experience supporting distributed workloads utilizing parallel computing frameworks such as:
+ MPI
+ OpenMP
Experience supporting the compilation and execution of scientific or engineering applications within Linux HPC environments.
Familiarity with common HPC programming languages and compiler toolchains including:
+ C/C++
+ Fortran
Candidates should understand how compiled applications interact with scheduler configuration, compute resources, cluster networking, and distributed runtime environments.
Experience troubleshooting application build or runtime issues related to compiler configuration, library dependencies, or MPI environments is desirable.
Familiarity with common HPC compiler toolchains such as GCC, Intel, or LLVM-based compilers is desirable.
**GPU Compute Workloads**
Experience executing or supporting workloads utilizing GPU-enabled compute environments and CUDA frameworks is desirable.
**Performance Troubleshooting**
Ability to identify issues affecting workload execution including:
+ inefficient resource allocation
+ scheduler configuration issues
+ application execution failures
+ distributed compute performance bottlenecks
**Automation and Operational Tooling**
Experience writing scripts or tooling using languages such as:
+ Bash
+ Python
Automation experience supporting workload execution or operational tasks is beneficial.
**Qualifications**
Candidates must meet the following requirements:
+ Bachelor's degree in science or technology; 4 additional years of experience can be substituted for the degree
+ 8+ years of experience is required
+ Minimum 5 years of experience working in Linux environments…