HPC Technical Consultant, Onsite; LANL Los Alamos, NM
Listed on 2026-06-13
IT/Tech
IT Support, Hardware Engineer, Systems Engineer, Technical Support
Overview
This role has been designed as onsite with an expectation that you will primarily work from an HPE partner/customer office.
Who We Are
Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work. We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next.
We know varied backgrounds are valued and succeed here. We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good. If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.
Join a dedicated on-site team supporting operations and hardware maintenance for HPE supercomputers in one of the nation’s premier High-Performance Computing facilities.
US Citizenship Required
Onsite daily work required in Los Alamos, NM. This is not a remote position.
Days/Hours: M-F, 8am to 5pm or 7am to 4pm
- Monitor and maintain system health across large-scale HPC compute, network, and storage infrastructure
- Troubleshoot and repair hardware issues on HPC servers and supporting systems
- Perform basic Linux system administration tasks as needed
- Create, monitor, update, and close support tickets
- Perform hardware component replacements using spares
- Operate hand tools and low-power tools for server maintenance
- Track and document hardware repairs, part replacements, and returns
- Create, update, and maintain site documentation, processes, and workflows
- Assist with new system installation and expansion activities
- Read system documentation and diagrams to locate components
- Collaborate with team members using email, Teams, Slack, and in-person communication
- Participate in the on-call schedule to support 24x7 operations
- Maintain tools and workspace in an organized manner
Candidates must meet all of the following requirements:
- Ability to obtain a Q Clearance (required)
- US Citizenship (required)
- Must be able to work onsite 5 days per week in Los Alamos, NM, with additional onsite work for on-call support. This is not a remote position.
- Strong mechanical aptitude and comfort using common hand tools (screwdrivers, pliers, wrenches, cable tools, etc.) for assembling, disassembling, and maintaining server hardware and related equipment
- Ability to lift up to 50 lbs individually and up to 75 lbs with assistance
- Solid understanding of computer hardware components (servers, drives, memory modules, power supplies, cabling, and peripherals)
- Proficiency with basic computer operations on Windows and macOS (MacBook), including OS navigation, file management, and standard productivity tools such as Slack, SharePoint, and Microsoft Office (Word, Excel, Outlook, and Teams)
A combination of the following is preferred:
- Associate’s degree, some college, or technical training (BS preferred)
- 2+ years of Linux system administration experience, including strong command-line navigation, log analysis and monitoring (journalctl, syslog, log files), troubleshooting system and application issues, and scripting/automation using Bash or Python
- Experience using Redfish (along with IPMI) for out-of-band server hardware management and monitoring. This includes using the Redfish RESTful API to query system health, power/thermal data, firmware inventory, component status, and event logs, and to perform actions such as system resets, power control, and BIOS configuration
- 2+ years of hands-on experience troubleshooting and maintaining server hardware in a datacenter environment, including diagnosing hardware faults, performing component replacements, rack mounting/decommissioning servers, and managing cable infrastructure
- 1+ year of experience with high-speed networking concepts and troubleshooting for Ethernet, HPE Slingshot, and InfiniBand fabrics, including link diagnostics, performance tuning, cable/fiber management, switch configuration, and fault isolation…