Product Lead, AI Infra
Job in Mountain View, Santa Clara County, California, 94039, USA
Listed on 2026-06-12
Listing for: GMI Cloud
Full Time position
Job specializations:
- IT/Tech: Systems Engineer, Cloud Computing, SRE/Site Reliability
Job Description & How to Apply Below
We are hiring a Product Lead to own GMI’s Cluster Management Dashboard and core AI infrastructure control planes.
This role sits at the intersection of software product, cloud infrastructure, PaaS, GPU clusters, Kubernetes, orchestration, monitoring, usage tracking, billing visibility, and enterprise platform operations.
You will work with engineering, design, GTM, B2B customer success, and leadership to turn complex infrastructure systems into clear product experiences for internal operators, platform engineers, developers, and enterprise customers.
Key Responsibilities
- Own the product roadmap for the Cluster Management Dashboard, including cluster provisioning, health status, utilization, workload visibility, alerts, and operational controls.
- Define product requirements for GPU cluster management, workload orchestration, autoscaling, quota management, access control, resource allocation, and failure recovery.
- Translate infrastructure concepts such as Kubernetes, containers, networking, storage, scheduling, observability, reliability, and capacity planning into clear product workflows.
- Build dashboard and platform views for usage tracking, billing visibility, cost reporting, SLA status, capacity planning, and customer-level resource consumption.
- Monitor and report key platform health metrics, including usage, performance, uptime, utilization, incidents, operational KPIs, and customer adoption.
- Identify friction points across the developer, operator, and enterprise customer experience; coordinate fixes with engineering and support teams.
- Own launch readiness for dashboard and platform releases, including release checklists, documentation, internal enablement, and customer-facing updates.
- Track product lifecycle events, including feature launches, upgrades, deprecations, migrations, and platform changes.
- Bridge engineering capacity, infrastructure constraints, and commercial commitments to support enterprise customer needs.
- Partner with GTM and customer success to convert platform requirements into scalable product capabilities, not one-off custom solutions.
Qualifications
- 6–10+ years of product management experience in cloud infrastructure, developer platforms, PaaS, SaaS, AI infrastructure, data platforms, or enterprise software.
- Proven experience building cloud consoles, dashboards, developer platforms, internal tools, infrastructure platforms, or control-plane products.
- Strong knowledge of PaaS workflows, including self-service provisioning, deployment, access control, quota management, usage tracking, and lifecycle management.
- Familiarity with Kubernetes, containers, orchestration, autoscaling, scheduling, observability, networking, storage, and reliability concepts.
- Ability to work closely with engineering on GPU clusters, workload management, capacity planning, monitoring, incident visibility, and failure recovery.
- Strong product judgment and ability to simplify complex infrastructure systems into clear user workflows.
- Excellent cross-functional communication across engineering, design, GTM, B2B customer success, and leadership.
Preferred Qualifications
- Experience with GPU cloud, AI infrastructure, inference platforms, model serving, ML platforms, HPC, or cloud-native infrastructure.
- Familiarity with tools such as Kubernetes, Docker, or cloud-native observability platforms.
- Experience with enterprise platform requirements around SLA visibility, billing transparency, resource governance, security, compliance, and reliability.
- Prior experience working on products that improve resource utilization, infrastructure efficiency, customer transparency, or operational scalability.