Senior HPC Engineer, Classified Computing
Listed on 2026-02-08
IT/Tech
Systems Engineer, Cybersecurity, Cloud Computing
The Field Intelligence Operations Division invites candidates to apply for the position of Senior High Performance Computing (HPC) Engineer for Classified Computing. The role leads the design, implementation, and management of HPC systems within a classified environment. We are looking for candidates with extensive experience in HPC architecture, cluster management, and parallel computing, and with a proven ability to work within highly secure and regulated environments.
This role involves close collaboration with security teams, scientists, and IT leadership to ensure that the HPC infrastructure meets the stringent performance, security, and compliance requirements necessary for classified work.
As part of our team, you will be joining a vibrant group of professionals eager to provide premier customer service to ensure people and information technology remain secure. The team is collaborative and strives to ensure security practices and procedures are understood, implemented, and enforced. All team members deliver ORNL’s mission by aligning behaviors, priorities, and interactions with our core values of Impact, Integrity, Teamwork, Safety, and Service.
As a U.S. Department of Energy (DOE) Office of Science national laboratory, ORNL has an impressive 80-year legacy of addressing the nation’s most pressing challenges. Our team is made up of over 7,000 dedicated and innovative individuals! Our goal is to create an environment where a variety of perspectives and backgrounds are valued, ensuring ORNL is known as a top choice for employment.
These principles are essential for supporting our broader mission to drive scientific breakthroughs and translate them into solutions for energy, environmental, and security challenges facing the nation.
Major Duties/Responsibilities:
- HPC System Design and Architecture: Lead the design and deployment of HPC systems, ensuring they meet the computational needs and security requirements of a classified environment.
- Create and maintain detailed documentation of HPC architectures, configurations, and operational procedures.
- Guide the architecture of the next generation of GPUs through an intuitive and comprehensive grasp of how GPU architecture affects performance for datacenter applications, especially Large Language Models (LLMs).
- Drive the discovery of opportunities for innovation in GPU, system, and datacenter architecture by analyzing the latest datacenter workload trends, Deep Learning (DL) research, analyst reports, the competitive landscape, and token economics.
- Cluster Management and Optimization: Oversee the installation, configuration, and management of HPC clusters, ensuring optimal performance, scalability, and reliability.
- Implement and manage job scheduling, resource allocation, and load balancing to maximize the efficiency of HPC resources.
- Security and Compliance: Ensure all HPC systems comply with security policies and regulatory requirements, implementing necessary controls and conducting regular audits.
- Collaborate with the security team to address vulnerabilities and ensure the protection of sensitive data within the HPC environment.
- Performance Tuning and Troubleshooting: Monitor and optimize the performance of HPC systems, identifying and resolving bottlenecks and inefficiencies.
- Identify and resolve complex issues, ensuring minimal downtime and disruption to critical operations.
- Collaboration and Leadership: Lead HPC-related projects, from initial planning and design through to implementation and operational support.
- Collaborate with scientists, researchers, and others to ensure that the HPC environment meets their computational needs.
- Mentor and support junior HPC engineers, sharing expertise and best practices.
- Continuous Improvement and Innovation: Research and remain informed of the latest advancements in HPC technologies, identifying opportunities for innovation and enhancement of the HPC infrastructure.
- Propose and implement improvements to existing systems and processes to support the evolving needs of the organization.
- Find opportunities where we uniquely can address customer needs, and translate these into compelling GPU…