Senior Software Engineer, Distributed Systems - NIM Factory

Job in Santa Clara, Santa Clara County, California, 95053, USA
Listing for: NVIDIA Corporation
Full Time position
Listed on 2026-06-06
Job specializations:
  • Software Development
    Software Engineer, AI Engineer
Salary/Wage Range or Industry Benchmark: 80000 - 100000 USD Yearly
Job Description & How to Apply Below
Locations: US, CA, Santa Clara; US, TX, Remote; US, NY, Remote; US, CA, Remote
Time type: Full time
Posted on: Today
Job requisition: JR2010745

NVIDIA is the platform upon which every new AI-powered application is built. We are seeking a senior engineer to design and build factory infrastructure and automation for NVIDIA Inference Microservices (NIMs). The right person for this role brings technical drive and creativity to change the way NVIDIA optimizes and serves performant inferencing for every AI model in heterogeneous cluster environments.

Our NIM offerings are easy to use, highly performant and tested in all deployment scenarios: in the cloud, on customers' self-hosted infrastructure and locally on all NVIDIA GPUs. You will apply your deep technical expertise to design an efficient, scalable and reliable automation factory infrastructure that turns AI models into NIMs validated for best-in-class performance and accuracy.

NVIDIA is building a new category of products by combining our prowess in deep learning and computing with industry-leading technologies. You will harness groundbreaking technologies and build a highly efficient factory to power how NVIDIA builds and validates NIMs for inferencing, all the way through deployment in heterogeneous hardware and software environments. You will influence and drive technical advances in NVIDIA's workflows and build the infrastructure that strives to accelerate the delivery of every AI model on NVIDIA's GPUs anywhere.

We are looking for technical talent to design and build our factory capabilities, including the underlying infrastructure, pipelines, backends, Docker build, test harness, metrics, performance engineering, log ingestion, and more.
**What you'll be doing:**
* Develop a factory pipeline that will take an AI model in and produce a deployable service that is validated across Cloud, On-prem and Kubernetes environments. With the team, define and deliver rapid iterations on the group's technical strategies and roadmaps to deliver and improve the NIM factory. You will design interfaces, data models and schemas, and expand observability over the factory pipeline and its compute infrastructure.
* Work with technical leaders designing and developing scalable and reliable factory components. You will collaborate with multiple AI model teams to understand their requirements and build an efficient infrastructure that improves every team's productivity.
* Define metrics and drive improvements based on user feedback. You will mentor and collaborate throughout the team and with other teams to grow your colleagues and yourself. You will have a history of learning and growing your own skills and those of the people around you.
**What we need to see:**
* A history of using your advanced programming skills to build distributed and compute systems, backend services, microservices and cloud technologies.
* Effective experience working with multi-functional teams, principals and architects, across organizational boundaries.
* Mentorship, growing teams and team members, and the flexibility to adjust your direction and expectations given the needs of our customers.
* Deep technical expertise in distributed containerized applications using technologies such as Docker, K8s, Cloud Endpoints, Helm, and Prometheus.
* Passion for building rich build and test automation pipelines for microservice applications.
* Excellent interpersonal skills and the ability to lead multi-functional efforts
* Proven experience debugging and analyzing the performance of distributed microservices or cloud systems.
* BS or MS in Computer Science, Computer Engineering or related field (or equivalent experience)
* 8+ years of demonstrated experience in roles developing performant microservices, cloud software and/or tooling
**Ways to stand out from the crowd:**
* Experience delivering event-driven applications using various services such as Temporal, Kafka, Redis or others and a demonstrable ability to discuss the pros and cons of these choices.
*…
Position Requirements
10+ Years work experience