GPU Systems Engineer

Job in Bethesda, Montgomery County, Maryland, 20814, USA
Listing for: Base-2 Solutions, LLC
Full Time position
Listed on 2026-07-01
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing: Infrastructure & Operations, Unix/Linux
Position: GPU Systems Engineer 3

Position Summary

Support enterprise AI mission systems by designing, developing, and optimizing GPU clusters, with deep focus on operating systems, hardware, GPU platforms, and high-speed networking in a secure customer environment.

Essential Duties and Responsibilities
  • Design, configure, and maintain GPU clusters.
  • Collaborate with a multidisciplinary team to define and optimize architectures for performance, power efficiency, and required features.
  • Work closely with AI/ML engineers to integrate GPUs with Linux-based systems.
  • Optimize GPU drivers for compatibility, reliability, and performance.
  • Analyze GPU performance, identify bottlenecks, and develop strategies to improve efficiency across hardware and software layers.
  • Build and maintain debugging tools, profiling utilities, and performance analysis software for Linux environments.
  • Leverage Bash, Python, Ansible, Puppet, and Salt for tooling and automation.
  • Maintain technical documentation, architectural specifications, and Linux best practices.
  • Support ATO activities and ensure compliance with federal security standards.
Required Qualifications
  • Active TS/SCI with ability to obtain a CI Polygraph.
  • Bachelor's degree with a minimum of six years of experience in the category field. Three additional years of experience may be substituted for the bachelor's degree.
  • Experience managing NVIDIA GPU data center platforms, including DGX, HGX, H200, H100, and L4s.
  • Knowledge of enterprise server components, including storage/network controllers, HBAs, and SSDs.
  • Strong expertise with Linux distributions, including RHEL, Ubuntu, Oracle Linux, and Rocky Linux.
  • Excellent problem-solving skills and the ability to collaborate within a team.
  • Meet DoD 8570.11 IAT Level II certification requirements at a minimum; IAT Level III is also acceptable.
  • U.S. citizenship is required due to the nature of the government contracts supported.
Preferred Qualifications
  • Experience with Kubernetes cluster management and AI/ML workflow orchestration, including Argo, Airflow, and Kubeflow.
  • Familiarity with GPU virtualization and cloud computing.
  • Experience with Prometheus and Grafana for monitoring.
  • Knowledge of distributed resource scheduling systems such as Slurm, LSF, or similar tools.
Required Certifications
  • DoD 8570.11 IAT Level II certification:
    Security+ CE, CCNA-Security, GICSP, GSEC, or SSCP.