Capacity Operations and Analytics Manager
Job in
Santa Clara, Santa Clara County, California, 95053, USA
Listed on 2026-06-03
Listing for:
NVIDIA Gruppe
Full Time position
Job specializations:
- IT/Tech
Cloud Computing: Infrastructure & Operations
Job Description & How to Apply Below
Responsibilities
- Manage and optimize GPU capacity and other compute resources across multiple cloud service providers to meet growing demands and ensure efficient utilization.
- Build, develop, and maintain data models, reporting systems, data automation pipelines, dashboards, and performance metrics that support infrastructure governance programs and strategic capacity decisions.
- Analyze the technical and business needs for GPU capacity and compute resources from internal and external teams.
- Identify performance bottlenecks in daily usage of compute resources and collaborate with infrastructure teams to resolve them.
- Drive infrastructure resource‑efficiency initiatives in partnership with engineering, finance, and product teams.
- Develop and enhance tooling for the cloud infrastructure and analytics platform to optimize resource usage and performance, including automation workflows and the application of AI techniques to extract useful insights.
- Partner and cross‑collaborate with Finance, Product, Service Owners, and Infrastructure Engineering to align cloud capacity management with company goals and develop KPIs that reflect customer satisfaction.
- Lead multi‑year, budget‑based compute resource planning with engineering.
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Software Engineering, or a related field (or equivalent experience).
- 10+ years of overall experience in cloud computing, specifically managing or sourcing GPU capacity with cloud service providers. A proven track record in large‑scale computing operations and planning is a plus.
- Strong technical proficiency in cloud architecture, development and deployment, and managing large data sets.
- Deep understanding of cloud service models (IaaS, PaaS, SaaS) and cloud infrastructure technologies. Experience with AWS, Azure, GCP, and OCI is required.
- Demonstrated experience employing AI tools and techniques to extract useful signals and insights from data, specifically to improve resource usage and automation.
- Strong understanding and practical application of statistical modeling and machine learning methodologies for improving operational efficiency and informing strategic capacity decisions.
- Proficiency with data analytics, visualization, and monitoring tools such as Kibana, Grafana, Splunk, Prometheus, Tableau, and Plotly.
- Ability to operate effectively amid uncertainty and rapidly changing business conditions, with an agile mindset and a commitment to ongoing improvement.
Base salary ranges from 168,000 to 270,250 USD. Eligible for equity and benefits.
Applications open until June 1, 2026.
Equal Opportunity Statement
NVIDIA is committed to fostering an inclusive work environment and is an equal opportunity employer. We do not discriminate on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.