Senior HPC Linux Systems Engineer; E
Listed on 2026-03-01
IT/Tech
Systems Engineer, Cybersecurity
Requisition Id 15958
Overview
Oak Ridge National Laboratory (ORNL) is seeking a Senior HPC Linux Systems Engineer to serve as a technical leader supporting some of the most advanced computing environments in the world.
This evergreen posting represents multiple potential openings for senior-level roles across ORNL's high-performance computing ecosystem.
Senior HPC Linux Systems Engineers are recognized experts who lead the design, implementation, and optimization of complex HPC infrastructure. They manage large-scale technical projects, guide technical direction for their teams, and serve as trusted advisors to scientific and operational leadership across ORNL.
Major Duties and Responsibilities
- Provide technical leadership in the design, integration, and administration of large-scale Linux-based HPC clusters, high-speed networks, and storage systems.
- Lead medium to large technical projects, coordinating requirements, schedules, and deliverables across internal and external stakeholders.
- Architect and deploy advanced infrastructure solutions supporting exascale-class and mission-critical computing environments.
- Serve as a technical mentor for HPC engineers, guiding best practices in automation, performance tuning, and system security.
- Develop, implement, and maintain configuration management and automation frameworks (e.g., Ansible, Puppet, Salt) to enhance reliability and reproducibility.
- Perform advanced system performance analysis, troubleshooting, and optimization, ensuring system scalability and long-term sustainability.
- Manage critical vendor and partner relationships, representing ORNL's technical requirements during procurement, integration, and system acceptance.
- Contribute to strategic planning and technology roadmaps, influencing unit goals and technical direction.
- Collaborate closely with scientists, researchers, and IT specialists to align infrastructure capabilities with research and security objectives.
- Ensure compliance with DOE cybersecurity standards, configuration baselines, and operational policies.
- Author technical documentation, present internal briefings, and communicate complex issues and resolutions to management and stakeholders.
- Participate in on-call rotations, maintenance windows, and incident response as needed to support 24x7 operations.
- Bachelor's degree in computer science, engineering, or a related technical field.
- A minimum of 8 years of relevant experience in Linux systems administration or HPC systems engineering.
- Demonstrated experience leading the design and deployment of HPC or large-scale distributed computing systems.
- Expertise with batch schedulers (SLURM, PBS, LSF) and parallel file systems (Lustre, GPFS/Spectrum Scale).
- Proven ability to lead technical projects from concept through implementation, balancing technical depth with project delivery.
- Strong proficiency in automation and infrastructure-as-code frameworks (Ansible, Puppet, Salt).
- Advanced scripting or programming skills (Python, Bash, Go) for automation and operational tooling.
- In-depth understanding of high-speed interconnects (InfiniBand, Slingshot, Ethernet) and storage architectures.
- Experience managing identity and access management systems, including MFA, SSO, and zero-trust frameworks (PingFederate, RSA SecurID, Entra).
- Experience integrating virtualization or containerization solutions (VMware, KVM, Apptainer, Podman) into HPC environments.
- Ability to manage client and stakeholder relationships across multiple directorates and technical disciplines.
- Excellent written and verbal communication skills, including the ability to present complex technical concepts to diverse audiences.
- Proven ability to influence technical strategy and mentor staff in a collaborative research environment.
This position requires the ability to obtain and maintain clearance from the Department of Energy. As such, it is a Workplace Substance Abuse Program (WSAP) testing designated position. WSAP positions require passing a pre-placement drug test and participating in an ongoing random drug testing program.
About ORNL
As a U.S. Department of Energy (DOE) Office of Science national laboratory, ORNL has an impressive 80-year legacy of addressing the nation's most pressing challenges. Our team is made up of over 7,000 dedicated and innovative individuals! Our goal is to create an environment where a variety of perspectives and backgrounds are valued, ensuring ORNL is known as a top choice for employment.
These principles are essential for supporting our broader mission to drive scientific breakthroughs and translate them into solutions for energy, environmental, and security challenges facing the nation.
- Work on the world's most powerful supercomputers, including Frontier, the first system to achieve exascale performance.
- Enable breakthrough science in fields like fusion energy, climate modeling, AI, and national security.
- Collaborate with diverse teams of scientists, engineers,…