HPC Linux Storage Engineer

Job in Oak Ridge, Anderson County, Tennessee, 37830, USA
Listing for: Oak Ridge National Laboratory
Full Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing
Salary/Wage Range or Industry Benchmark: 100000 - 125000 USD Yearly
Job Description & How to Apply Below

Oak Ridge National Laboratory (ORNL), home to some of the world’s most powerful supercomputers, is seeking highly skilled professionals to support large-scale storage systems, high-speed parallel file systems, and archival solutions critical to advancing scientific discovery and innovation. As part of ORNL’s leadership-class computing ecosystem, you will play a vital role in designing, deploying, optimizing, and maintaining infrastructure that powers cutting-edge research across diverse scientific domains.

This evergreen posting represents multiple opportunities across ORNL’s high-performance computing (HPC) environment, supporting scalable, reliable, and secure computing and storage capabilities. Applications are reviewed on an ongoing basis as new positions become available to meet the dynamic needs of our world-class computing facility.

Job Duties and Responsibilities May Include:
  • Design and Management of Infrastructure: Architect, deploy, and manage large-scale storage systems and HPC platforms to support research, scientific, and enterprise workloads. Develop and implement solutions for structured, unstructured, and archival data storage, focusing on scalability, reliability, and performance.
  • Systems Analysis and Development: Apply systems analysis techniques to consult with users/customers, determine functional requirements, and design, test, or optimize storage and computational solutions tailored to their needs. Develop, document, and modify solutions, including system prototypes and automated workflows, to enhance operational efficiency.
  • Performance, Optimization, and Troubleshooting: Ensure the performance, availability, scalability, and security of diverse infrastructure environments. Diagnose and resolve complex operational challenges quickly and effectively, applying advanced performance optimization techniques for a wide range of workloads.
  • Collaboration and Best Practices: Work closely with stakeholders from research, technical, and operational teams to understand workflows, identify opportunities for improvement, and deliver effective solutions. Define, implement, and enforce best practices, standards, and procedures across projects and teams.
  • Automation and Innovation: Automate system configuration, provisioning, monitoring, and maintenance to reduce manual effort and downtime. Evaluate emerging technologies and tools to continuously improve system capabilities, adapt to changing needs, and plan for future advancements.
  • Support and Maintenance: Support critical infrastructure through participation in a 24/7 on-call rotation and off-hours maintenance windows. Resolve hardware and software issues in coordination with vendors, ensuring minimal impact on operations.
Basic Qualifications
  • Bachelor’s degree in computer science, engineering, information technology, or a related field; and at least 5 years of professional experience managing Linux/UNIX systems in heterogeneous environments. An equivalent combination of education and experience will be considered.
  • Demonstrated experience with high-performance computing (HPC) storage systems and enterprise storage platforms (e.g., Lustre, GPFS, BeeGFS, or WEKA).
  • Proficiency in scripting languages (e.g., Python, Bash, Perl) and configuration management/automation tools (e.g., Ansible, Puppet, Git).
  • Strong communication, collaboration, and problem-solving skills with the ability to design and implement solutions independently.
Preferred Qualifications
  • Active DOE Q, DoD Top Secret, or TS/SCI clearance.
  • Hands-on experience with HPC cluster technologies, including job schedulers (e.g., Slurm) and system deployment tools (e.g., Warewulf, PXE boot, Bright Cluster Manager).
  • Expertise in high-performance parallel file systems, tape library systems, and storage networking technologies (e.g., RAID, ZFS, NVMe-oF, InfiniBand).
  • Familiarity with performance monitoring tools (e.g., Grafana, Nagios), benchmarking systems, and I/O optimization techniques.
  • Experience with virtualization and containerization platforms (e.g., VMware, KVM, Podman, Apptainer).
  • Background in open source development, including submitting patches…