Capacity Operations Manager
Job in
Santa Clara, Santa Clara County, California, 95053, USA
Listed on 2026-06-04
Listing for:
NVIDIA Corporation
Full Time position
Job specializations:
- IT/Tech
Cloud Computing: Infrastructure & Operations, Data Science Manager
Job Description & How to Apply Below
Locations: US, CA, Santa Clara; US, Remote
Time type: Full time
Posted on: Posted Today
Job requisition: JR2014715
Our technology is limitless! NVIDIA is developing the world’s most innovative and groundbreaking computing platforms. Due to our work, scientists, researchers, and engineers are able to advance their ideas. At its essence, our visual computing technology offers not only an outstanding computing experience but also energy efficiency! We led the way in a supercharged style of computing embraced by the fastest-moving computer users globally—scientists, designers, artists, and gamers.
However, it’s more than just technology! It’s our people, some of the brightest in the world, and our company that make NVIDIA one of the most fun, inventive, and dynamic workplaces! At the core of NVIDIA are values such as innovation, excellence, determination, and collaboration that inspire us to achieve our best.
**What you will be doing:**
* Coordinate the development of High Performance Computing (HPC) clusters, collaborating closely with internal and external engineering teams.
* Manage and optimize GPU capacity and other compute resources across multiple cloud service platforms to meet growing demand and ensure efficient deployment.
* Design, improve, and manage data models, reporting platforms, data automation solutions, dashboards, and performance metrics that support NVIDIA Infrastructure governance programs and strategic capacity decisions.
* Assess the technical and business requirements for GPU capacity and other compute resources from different internal and external groups.
* Identify performance bottlenecks in day-to-day usage of compute resources and collaborate with relevant infrastructure teams to resolve them.
* Drive infrastructure resource efficiency initiatives in partnership with engineering, finance, and product teams.
* Develop and enhance tooling for our cloud infrastructure and analytics platform to optimize resource usage and performance for NVIDIA and its customers. This includes designing and building tools for automating workflows and potentially applying AI techniques to extract useful signals and insights from generated data.
* Partner with Finance, Product, Service Owners, and Infrastructure Engineering teams to align cloud capacity management with company goals and to develop infrastructure and service-level benchmarks that reflect customer satisfaction.
**What we need to see:**
* Bachelor's or Master's degree in Computer Science, Software Engineering, or a related field, or equivalent experience.
* 8+ years of overall experience in cloud computing, specifically in managing or using GPU capacity for high performance computing. A proven record of large-scale computing operations and planning is a plus.
* Strong technical proficiency in cloud architecture, development and deployment, and managing large data sets.
* Experience with command line interfaces and shell scripting languages.
* Comprehensive knowledge of cloud service models (IaaS, PaaS, SaaS) and cloud infrastructure technologies. Practical experience with Cloud Service Providers including AWS, Azure, GCP, and OCI is essential.
* Demonstrated experience in applying AI tools and techniques to extract useful signals and insights from data, specifically to improve resource usage and automation.
* Deep knowledge and active use of statistical modeling and machine learning approaches for boosting operational efficiency and supporting strategic capacity decisions.
* Understanding of analytics, statistical modeling, and machine learning methodologies.
* Strong communication and relationship-building skills, with the ability to work well across different departments and contribute to strategic decisions.
* Self-starter, self-motivated, focused, and self-sufficient, with a willingness to take on new challenges and adapt quickly in a dynamic environment.
* Ability to operate effectively amidst uncertainty and rapidly changing business conditions, with an agile approach and a commitment to ongoing improvement.
NVIDIA is leading the way in…