Senior HPC and AI Cluster Administrator
Job in Tampa, Hillsborough County, Florida, 33646, USA
Listed on 2026-05-22
Listing for: Accenture Federal Services
Full-time position
Job specializations:
- IT/Tech: Cloud Computing, Systems Engineer, Cybersecurity
Job Description & How to Apply Below
Position
Senior HPC and AI Cluster Administrator – Accenture Federal Services (AFS)
Key Responsibilities
- Design, deploy, and maintain HPC/AI clusters.
- Manage AI job workflows using scheduling technologies such as Kubernetes.
- Support and maintain continuous integration and delivery pipelines.
- Troubleshoot and fix issues at bare metal, operating system, software stack, and application levels.
- Support research, development, and operational activities.
Qualifications
- Bachelor’s degree in Computer Science, Engineering, or related field; or equivalent experience.
- 5+ years of experience in HPC/AI solution technologies including hardware, hypervisors, CPU/GPU.
- Experience with workload scheduling and orchestration tools such as Slurm and Kubernetes.
- Excellent knowledge of Linux (Red Hat, Ubuntu), networking (routing, switching), ACLs, and OS-level security protection.
- Experience with storage solutions such as Lustre, GPFS, ZFS, XFS, and emerging storage technologies.
- Automation and configuration management skills using Python and Bash within GitOps workflows.
- Knowledge of networking technologies such as InfiniBand and Ethernet.
- Experience with private cloud platforms (VMware, Hyper‑V, KVM).
- Familiarity with public cloud platforms (AWS, Azure).
- Must possess and maintain required DoD 8140 certifications.
- Knowledge of GPU architectures, time‑slicing, and multi‑instance GPU (MIG).
- Experience with container orchestration technologies like Kubernetes and Docker.
- Experience designing AI workflows with technologies such as Apache Airflow, Prefect, and Dagster.
- Background with RDMA (InfiniBand or RoCE) fabrics.
- Experience in regulated industries applying compliance requirements (DISA STIG, CIS, etc.).
- NVIDIA Certifications (AI Infrastructure, AI Operations, AI Networking).
- VMware Certifications (Certified Professional / Advanced Professional).
- An active TS/SCI federal security clearance is required.
Position Requirements
10+ years of work experience