Senior Director, Advanced Systems; HPC
Listed on 2025-12-06
IT/Tech
Data Scientist, Cloud Computing
Department: Research Computing
Category: Information Technology
Job Type: Full-Time
Overview
The Senior Director of Advanced Systems, in the Research Computing department within the Office of the Dean for Research, leads a team of technical experts and system administrators who design, deploy, maintain, and support Princeton's centralized high performance computing (HPC) systems. Based on the strategic vision defined by University leadership and campus partners, the Senior Director of Advanced Systems sets the strategic roadmap for advanced HPC systems, provides technical leadership to their team and campus partners, and is ultimately responsible for cluster design, implementation, schedule efficiency, and administration.
The Senior Director monitors research and technology trends in the computational research space and spearheads the technical design and implementation of services that meet campus needs. This requires the Senior Director to remain current with trends in data science, machine learning, deep learning, and AI technologies while anticipating faculty needs and strategically advocating for growth in Princeton‑offered services. Responsible for understanding changes in campus needs and technology evolution, the Senior Director of Advanced Systems plays a crucial role in balancing institutional investment with campus demand and new technology trends to meet the needs of our world-class research community.
Technical Leadership
- Provide vision, technical expertise, and direct support for computational research needs in academic departments across the university, including faculty, professional research staff, postdocs, graduate students, and undergraduate students.
- Participate in and, in some cases, lead committees related to the governance and continued evolution of computational research clusters and services necessary to meet the needs of world-class researchers at Princeton.
- Monitor research and technology trends in high performance computing and research computing support by reading literature and white papers, following relevant periodicals, participating in related forums, and attending relevant conferences.
- Remain current with the latest advancements in technologies supporting data science, machine learning, deep learning, and AI.
- Develop and maintain strong and positive vendor relationships to keep abreast of technical advances, and act as a primary point of contact for all vendor interactions.
- Work collaboratively with faculty, academic departments, research computing leadership, and senior administration to develop, evolve, and execute a coordinated strategic direction for deploying, operating, and supporting computational and data resources that meet the evolving needs of research and scholarship at the University.
- Fully comprehend the challenges facing users of the computational clusters. Monitor ticket queues, coordinate support, and escalate troubleshooting of complicated problems.
- Leverage team members' strengths to adapt to new challenges and requirements.
- Spearhead the technical design and deployment of centralized research computing systems, consulting with faculty to understand their needs, interfacing with vendors to match capabilities to requirements, and leading the technical team to implement systems that provide the best balance of institutional investment and computational performance.
- Lead the system administration for Princeton's centralized HPC systems, which represent a significant capital investment to support production computational and data science research on campus and to enable researchers to scale to larger regional and national systems. The central HPC systems serve a large percentage of the University's faculty and researchers from a majority of the academic departments.
- Develop short- and long-term strategies promoting the growth and support of high performance computing and research computing systems.
- Utilize scheduling and workload management software (currently SLURM) for job submission, resource allocation, and monitoring within an HPC environment.
- Design and optimize code for…