
HPC Engineer

Job in 201301, Noida, Uttar Pradesh, India
Listing for: HCLTech
Full Time position
Listed on 2026-02-14
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing
Job Description & How to Apply Below
Position Overview (Job Summary):
The role is for an HPC Engineer responsible for designing, deploying, managing, and optimizing an on-premises High Performance Computing (HPC) environment.
The environment includes SLURM-managed CPU and GPU clusters.
The role places strong emphasis on HPC architecture, Linux administration, job scheduling, and cluster operations.

Experience with parallel/distributed storage (WekaFS, Scality) is preferred but optional.
Primary Skills:

HPC Operations & Cluster Management (CPU & GPU)
SLURM Workload Manager (Mandatory): Install/configure/manage SLURM across multiple clusters
Partitions/queues, fairshare, job priority, scheduling policies
Upgrades, migrations, automation via API/integrations
Linux System Administration (RHEL focus): OS patching, hardening, tuning, package management
Troubleshooting & Performance Optimization: Cluster health, node/job failures, bottlenecks, utilization optimization
Parallel Computing Knowledge: MPI, OpenMP, distributed execution fundamentals
Secondary Skills (Preferred / Optional):
Storage / Parallel File Systems

WekaFS (preferred, optional)
Scality RING / ARTESCA (preferred, optional)
GPU Computing Exposure: NVIDIA drivers, CUDA familiarity, GPU scheduling concepts
Monitoring Tools: Grafana, Prometheus
Automation / Scripting: Bash/Python for workflows, tooling, ops automation
HPC Ecosystem Components: InfiniBand/100G networking, monitoring tools, storage tiering concepts
SLURM-based HPC clusters
Linux (RHEL) administration
Multi-node distributed systems
(Optional) Storage platforms like WekaFS / Scality

Role and Responsibilities:

A. Key Responsibilities
1) HPC Infrastructure & Operations
Manage day-to-day operations of on-prem CPU & GPU clusters
Monitor health, performance, utilization; ensure availability & efficiency
Implement best practices for:
HPC operations
user management
resource administration
Troubleshoot:
networking issues
node failures
job failures
performance bottlenecks
User support:
job submissions
resource usage
HPC workflows
2) SLURM Workload Manager (Mandatory)
Configure/install/manage SLURM across multiple clusters
Manage:
queues
partitions
node allocation policies
fairshare policies
job prioritization
Handle:
SLURM upgrades
migrations
maintenance activities
Work with SLURM APIs/integrations for:
automation
custom workflows
Optimize scheduling for mixed CPU/GPU workloads
3) Linux System Administration
Administer:
compute nodes
head nodes
admin servers
Perform:
OS updates
package installs
security patching
system tuning
Automate via:
scripting (Bash/Python)
4) Parallel Computing & Cluster Architecture
Understand and support workloads using:
MPI
OpenMP
distributed execution
Work with HPC building blocks:
high-speed interconnects (InfiniBand/100G)
storage tiers
resource managers
monitoring tools
Diagnose and resolve:
parallel workload performance issues
B. Additional Responsibilities (Optional / Preferred Area)
5) Storage (Optional but Preferred)
A. WEKA (WekaFS)
Manage/tune parallel file system performance
Troubleshoot WekaFS issues with minimal downtime
Provide internal guidance and usage best practices
Track ecosystem improvements & recommend enhancements
B. Scality
Maintain and troubleshoot:
Scality RING
ARTESCA environments
Monitor/tune for high availability & reliability
Create documentation (configuration + SOPs)
Recommend performance improvements based on product enhancements