HPC Systems Engineer
Listed on 2026-02-16
IT/Tech
Systems Engineer
Nscale – HPC Systems Engineer
Are you passionate about Data Centre builds and large‑scale GPU infrastructure projects? Do you thrive in a fast‑paced, high‑growth environment where your work has a direct impact on business outcomes?
At Nscale we provide cost‑effective, high‑performance GPU cloud infrastructure for AI start‑ups and large enterprise customers. Our Engineering team drives the delivery of our GPU infrastructure.
About the role
Network engineers at Nscale are responsible for the design, deployment, and ongoing operation of all networking services that underpin both the internal management platform and the customer‑facing cloud infrastructure, including internet transit, WAN connectivity and DC networking. You will act as a 3rd/4th‑line escalation point for the support organisation.
What you’ll be doing
- Designing, deploying and operating large‑scale HPC clusters and GPU‑based compute environments.
- Creating and maintaining hardware architectures, including BOMs, rack elevations and reference designs.
- Implementing and maintaining HPC scheduling and workload management systems (e.g., Slurm).
- Designing and optimising InfiniBand and Ethernet network topologies (Fat Tree, Dragonfly, rail‑optimised configurations).
- Working with deployment teams to ensure cluster builds align with architectural specifications.
- Automating provisioning, configuration and operations of multi‑vendor HPC hardware and software stacks.
- Collaborating with software, infrastructure and data centre teams to ensure seamless integration of HPC environments.
- Troubleshooting and tuning cluster performance across compute, storage and interconnect layers.
What we’re looking for
- Proven experience in designing, deploying and operating HPC or large‑scale compute clusters.
- Strong knowledge of Slurm or similar workload management systems (e.g., PBS, LSF).
- Proven experience in InfiniBand networking design and operations, including subnet management, QoS, RDMA and performance tuning.
- Experience with high‑speed Ethernet networks and associated protocols (e.g., VLAN, LACP, BGP, OSPF, EVPN, VXLAN).
- Familiarity with HPC network topologies such as Fat Tree or Dragonfly.
- Experience creating hardware BOMs, rack layouts and reference architectures for compute deployments.
- Strong scripting skills in Python and/or Bash for automation and orchestration.
- Solid understanding of optics, cabling and physical layer design considerations for HPC and GPU cluster environments.
- Strong analytical, troubleshooting and documentation skills.
- A collaborative mindset and passion for building high‑performance, scalable infrastructure.
- Proactive and self‑motivated, with a strong sense of ownership.
- Thrives in a fast‑paced, dynamic and high‑growth environment.
- Collaborative team player with a passion for delivering outstanding candidate and stakeholder experiences.
- Strong attention to detail and documentation skills.
- Excellent communication skills, both written and verbal.
- A self‑starter mindset with a “see a problem, fix a problem” mentality.
Travel requirement: 20‑30% travel to our European sites.
What we can offer you
- Highly competitive package (base + equity) with reviews every 12 months.
- Dynamic progression plan tailored to your ambitions.
- Human‑first flexibility: autonomous workday and flexible workplace.
We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio‑economic backgrounds. If there’s anything we can do to accommodate your specific situation, please let us know.
The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.
For information on how Nscale handles candidate personal data, please see our Employee & Candidate Privacy Notice.
Mid‑Senior level
Employment type: Full‑time
Job function: Information Technology