
Senior HPC Engineer

Job in Mountain View, Santa Clara County, California, 94039, USA
Listing for: ASRC Federal
Full Time position
Listed on 2025-12-19
Job specializations:
  • IT/Tech
    IT Support, Systems Engineer, Systems Administrator, Cloud Computing
Job Description & How to Apply Below

Posted Tuesday, December 16, 2025 at 5:00 AM

ASRC Federal is looking for a Senior HPC Engineer. ASRC Federal InuTeq provides High Performance Computing services across the full HPC lifecycle, including computational requirements, architecture, acquisition, and operations, for federal government customers, while promoting innovation, continuous standards-driven improvement, and industry best practices.

This senior role supports the NASA NACS High Performance Computing contract by delivering continuous architectural enhancements and operational excellence. The successful candidate will serve as a proactive senior member of the team, reporting to the Manager of the HPC Computer Systems and Storage (CSS) group, and will bring extensive experience in designing, installing, maintaining, and upgrading large-scale HPC environments, including expertise with common batch schedulers such as PBS, Slurm, or Moab/Torque and with InfiniBand troubleshooting and optimization.

The candidate will actively participate in day-to-day HPC operations such as system patching, OS upgrades, new system deployments, scripting, troubleshooting, testing, benchmarking, and user tool development. The role also directly supports scientific users by diagnosing and reproducing application performance issues, analyzing trouble tickets for recurring patterns, and contributing to both system improvements and user education.

Key Responsibilities:
  • Design, deploy and maintain production HPC clusters with 2000+ nodes, InfiniBand interconnect, and 100+ petabytes of data storage.
  • Shepherd and/or contribute to scalable feature designs through the entire software development process, from requirements and use cases to release.
  • Design and develop scripts for system administration, monitoring and usage reporting.
  • Modify existing software to correct errors and/or improve performance.
  • Design and develop scripts for system regression and performance testing (file systems (Lustre), scheduler (PBS), interconnect (HDR/NDR, Slingshot), high availability, etc.).
  • Troubleshoot, isolate and resolve application, system and other technical problems (hardware, software, and network).
  • Understand research use cases, research and deploy new technologies, defining cost, performance and other trade-offs.
  • Manage and maintain tools for provisioning, configuration management (HPCM, Ansible, and Git), resource management, scheduling and all necessary aspects of HPC in accordance with best practices.
  • Research, deploy and manage networking and security infrastructure, including development of policies and procedures.
  • Assist in developing and writing proposals and publications.
  • Create and provide clear documentation.
  • Mentor junior staff and cross‑train peers.
  • Provide after‑hours/weekend support as required.
  • Moderate Supercomputing System Administration that contributes to:
    • Day-to-day operations of the Linux HPC clusters and storage systems.
    • Proactive monitoring, analysis, and correction of system issues.
    • Development of scripts to automate repetitive tasks or tools to enhance support of the HPC systems.
    • System performance analysis and tuning.
    • Building, installing, and supporting user‑requested software.
    • Supporting evaluation and assessment of new HPC technology.
    • Resolving user‑reported issues and managing support ticket requests in Remedy.
Requirements:
  • Bachelor’s degree in computer science or related field.
  • Strong computer science background with in‑depth systems‑level knowledge in operating systems and networking.
  • A minimum of 10 years of experience in the administration of HPC systems and scheduling software (PBS, Slurm, or Moab/Torque).
  • A minimum of 10 years of experience in systems programming in heterogeneous, multi‑platform HPC environments.
  • Strong ability to analyze, debug and maintain the integrity of an existing code base.
  • Demonstrated equivalent of 5 years of Linux/UNIX user support experience and hands‑on experience with administration of Linux systems.
  • Experience working with HPC applications and proficiency in at least one of C, C++, or Fortran.
  • Superior scripting skills and excellent attention to detail; proficiency in at least one of Python, Perl, or Bash.
  • Strong ability to interact with customers to understand needs, elicit…
Position Requirements
10+ Years work experience