Advisor - GPU Platform Engineering

Job in Indiana, Armstrong County, Pennsylvania, 15705, USA
Listing for: Eli Lilly and Company
Full Time position
Listed on 2025-11-27
Job specializations:
  • IT/Tech
    AI Engineer, Cloud Computing, Systems Engineer, Data Engineer
Salary/Wage Range or Industry Benchmark: 135000 - 213400 USD Yearly
Job Description & How to Apply Below
At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism.

We give our best effort to our work, and we put people first. We’re looking for people who are determined to make life better for people around the world.

Come help us unlock the power of AI and HPC based POGPU and Accelerated Compute infrastructure!

The Cloud and Connectivity organization is seeking experts and leaders in AI, High-Performance Computing (HPC), and Nvidia DGX server management. This role will also focus on Spectrum X networking technologies and Weka storage integration to support cutting-edge AI/ML workloads.
** What You’ll Be Doing**
You will be driving the engineering and operations of advanced Linux platforms supporting AI and HPC workloads, managing Nvidia DGX systems using Mission Control, Base Command and Run:AI, and optimizing Spectrum X networking and WEKA storage for AI/ML applications. You will play a crucial role in boosting productivity for our Advanced Intelligence and Data Science teams through implementing advancements across our AI/HPC infrastructure tooling and operational excellence.

You will work in our Infrastructure Hosting Platform area, leading the strategy, engineering, and development of advanced Linux computing capabilities for AI/ML. Additionally, you will consult with our senior Linux platform engineer on directing the global Linux strategy for on-premises private cloud and public IaaS Linux services.
** How You’ll Succeed**
* ** Be Bold** - You will bring high learning agility and infrastructure availability and reliability engineering skills to help us enable the Lilly Technology strategy, identify tech opportunities, and accelerate our cloud journey.
* ** Be Fast** - You will accelerate initiatives in areas such as: AI/ML acceleration, Infrastructure AI OPS automation, HPC management, and infrastructure as code to enable critical business projects.
* ** Be Proactive** - You will have groundbreaking chances to build secure, resilient, and reliable hybrid cloud services using proactive, predictive, and automated capabilities.
* ** Be Your Best** - You will learn about new technologies, including AI/ML-based HPC, large-scale GPU clustering, Infrastructure as Code, enterprise-scale hyperscale cloud providers, and agile ways of working, and you will bring a willingness to become an expert.
** What You Should Bring**
* Expertise in Linux system administration, HPC environments, and Nvidia DGX server management. Experience with Spectrum X networking and parallel file systems is essential. Strong scripting skills and familiarity with containerization and automation tools are highly valued.
* 6+ years of demonstrated experience in AI/ML and HPC workloads and infrastructure.
* Hands-on experience in using or operating High Performance Computing (HPC) grade infrastructure, as well as in-depth knowledge of accelerated computing (e.g., GPU), storage (e.g., Weka), scheduling & orchestration (e.g., Slurm, Kubernetes, LSF), high-speed networking (e.g., Ultra-Ethernet, RoCE), and container technologies (Docker).
* Passion for continual learning and keeping abreast of new technologies and effective approaches in the AI/ML infrastructure field.
* Expertise in running and optimizing large-scale distributed training workloads using PyTorch (DDP, FSDP), NeMo, or JAX, along with a deep understanding of AI/ML workflows, encompassing data processing, model training, and inference pipelines.
* Some proficiency in at least one scripting language such as Bash, Python, or equivalent.
** Basic Qualifications**
* Bachelor’s degree in Computer Science, Information Technology, or a related technical field.
* 10+ years’ experience as a Linux OS/Platform Engineer.
* Demonstrated experience leading a large-scale global infrastructure project.
** Additional Information:**
Hybrid role located in Indianapolis, IN (relocation required)