Principal ML Infrastructure Engineer

Job in Dallas, Dallas County, Texas, 75215, USA
Listing for: Franklin Fitch
Full Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning / ML Engineer, Systems Engineer
Salary/Wage Range or Industry Benchmark: 125,000 - 150,000 USD per year
Job Description & How to Apply Below
Position: Principal ML Infrastructure Engineer (Relocation Available)

Overview

AI Infrastructure Engineer (GPU Systems & Model Deployment); Principal and entry-level positions available

We are seeking an AI Infrastructure Engineer to design and optimize high-performance systems that enable machine learning models to run reliably and efficiently in production environments. This role is focused on GPU-accelerated inference, low-latency model serving, and bridging the gap between research models and real-world deployment. You will work closely with ML researchers and software engineers to ensure models are production-ready, scalable, and performant.

This is a hands-on systems role with a strong emphasis on C++, CUDA, and GPU inference optimization.

Core Responsibilities
  • Design and maintain GPU-accelerated infrastructure for deploying machine learning models in production
  • Build and optimize high-throughput, low-latency inference pipelines
  • Develop and maintain performance-critical components in C++
  • Optimize GPU utilization through CUDA programming and kernel tuning
  • Support model conversion, optimization, and deployment using inference runtimes
  • Partner with ML researchers to transition models from experimentation to production
  • Diagnose and improve system performance relative to baseline benchmarks
  • Ensure deployed systems are reliable, observable, and maintainable in production environments
Required Qualifications
  • Master's or PhD required
  • Strong C++ expertise with experience writing and optimizing production-grade systems
  • Hands-on CUDA programming experience and GPU performance optimization
  • Solid understanding of GPU architectures and memory management
Preferred / Nice-to-Have Qualifications
  • Experience with TensorRT or similar GPU inference runtimes
  • 1–7 years of experience as a Software Development Engineer supporting production model deployment
  • Experience with model optimization, quantization, or runtime acceleration techniques
  • Exposure to ML frameworks (e.g., PyTorch, TensorFlow) from a systems or deployment perspective
  • Experience working with containerized environments and CI/CD pipelines
Tech Environment (Representative, Not Exhaustive)
  • C++, CUDA
  • GPU inference runtimes (e.g., TensorRT)
  • Linux, containers, cloud or on-prem GPU systems