Senior HPC and AI Cluster Administrator
Listed on 2026-05-21
IT/Tech
Cloud Computing, Systems Engineer, Cybersecurity, IT Support
At Accenture Federal Services, nothing matters more than helping the US federal government make the nation stronger and safer and life better for people. Our 13,000+ people are united in a shared purpose to pursue the limitless potential of technology and ingenuity for clients across defense, national security, public safety, civilian, and military health organizations.
Join Accenture Federal Services, a technology company within global Accenture. Recognized as a Glassdoor Top 100 Best Place to Work, we offer a collaborative and caring community where you feel like you belong and are empowered to grow, learn and thrive through hands‑on experience, certifications, industry training and more.
Join us to drive positive, lasting change that moves missions and the government forward!
AFS is looking for a Senior HPC and AI Cluster Administrator to support software and data solutions for our customers. We are integrating supercomputers and AI clusters based on existing technologies, and we need a system administrator who will play a key role in enabling artificial intelligence and GPU computing solutions.
You will work with many scientific researchers, developers, and customers to create improved workflows and develop unique solutions. You will interact with HPC, OS, GPU compute, and systems specialists to architect, develop, and bring up large-scale performance platforms.
Key Responsibilities
- Design, deploy, and maintain HPC/AI clusters
- Manage AI job workflows using various scheduling technologies, such as Kubernetes.
- Support and maintain continuous integration and delivery pipelines
- Troubleshoot and fix issues from the bottom up, across the bare metal, operating system, software stack, and application levels
- Support Research, Development, and Operational activities.
Qualifications
- Bachelor's Degree in Computer Science, Engineering, or a related field; or equivalent experience
- 5 years of experience in any of the following:
- Knowledge of HPC and AI solution technologies, including hardware, hypervisors, CPUs, and GPUs.
- Experience with job scheduling and workload orchestration tools such as Slurm and Kubernetes (K8s)
- Excellent knowledge of Linux (e.g. Red Hat, Ubuntu) internals and networking (routing, switching), ACLs, OS-level security protections, and common protocols such as TCP, DHCP, and DNS.
- Experience with multiple storage solutions such as Lustre, GPFS, ZFS, and XFS. Familiarity with newer and emerging storage technologies.
- Experience with automation and configuration management using tools such as Python and Bash within GitOps workflows.
- Knowledge of networking protocols such as InfiniBand and Ethernet
- Experience with private cloud platforms (for example VMware, Hyper‑V, KVM)
- Familiarity with public cloud computing platforms (e.g. AWS, Azure)
- Must possess and maintain required DoD 8140 certifications.
- Knowledge of GPU architectures, time‑slicing, and Multi‑Instance GPU (MIG)
- Experience with container and orchestration technologies, e.g. Docker and Kubernetes
- Experience designing and deploying AI workflow technologies such as Apache Airflow, Prefect, and Dagster.
- Background with RDMA (InfiniBand or RoCE) fabrics
- Experience working in regulated industries and applying compliance requirements (e.g. DISA STIG, CIS)
- NVIDIA Certifications (AI Infrastructure, AI Operations, AI Networking)
- VMware Certifications (Certified Professional / Advanced Professional)
- An active TS/SCI federal security clearance is required
As required by local law, Accenture Federal Services provides reasonable ranges of compensation for hired roles based on labor costs in the states of California, Colorado, Hawaii, Illinois, Maryland, Massachusetts, Minnesota, New Jersey, New York, Washington, Vermont, the District of Columbia, and the city of Cleveland. The base pay range for this position in these locations is shown below. Compensation for roles at Accenture Federal Services varies depending on a wide array of factors, including but not limited to office location, role, skill set, and level of experience. Accenture Federal Services offers a wide variety of benefits. You can find more information on benefits here. We accept applications on an on-going basis and there is no fixed…