Senior Engineering Leader - AI Infrastructure and Inferencing
Listed on 2026-01-31
Software Development
AI Engineer, Machine Learning/ML Engineer
About Gruve
Gruve is an innovative software services startup dedicated to transforming enterprises into AI powerhouses. We specialize in cybersecurity, customer experience, cloud infrastructure, and advanced technologies such as Large Language Models (LLMs). Our mission is to help our customers use their data to make more intelligent decisions in support of their business strategies. As a well‑funded early‑stage startup, Gruve offers a dynamic environment with strong customer and partner networks.
About the Role
We're seeking an exceptional Senior Engineering Leader to build and lead a high‑performing engineering team focused on design and development of a distributed multi‑tenant AI inference SaaS platform. Platform development responsibilities include software design, development and testing for multiple domains such as inference engines (AI/ML, program and compiler analysis), core platform services, and observability. This role sits at the intersection of systems engineering, AI/ML operations, and product development, requiring both deep technical expertise and proven leadership capabilities.
As a leader at Gruve, you'll drive the technical vision and execution of critical infrastructure that enables our AI capabilities. You'll work closely with cross‑functional partners including research scientists, product managers, and other engineering leaders to deliver robust, performant systems that power our AI products.
This position is based in the United States and reports to the SVP of Inferencing and Infrastructure Management.
Key Responsibilities
- Team Leadership & Development: Build, mentor, and scale a world‑class engineering team of 10‑15+ engineers. Foster a culture of technical excellence, collaboration, and continuous learning. Conduct performance reviews, career development planning, and succession planning.
- Technical Strategy & Architecture: Define and execute the technical roadmap for AI inference infrastructure, AI tool chains, and AI software development. Make critical architectural decisions that balance performance, scalability, maintainability, and cost.
- Compiler Design & Optimization: Lead the development of AI inference systems and optimizations for AI workloads, including graph optimization, kernel fusion, and hardware‑specific code generation to maximize inference performance.
- AI Model Development & Deployment: Oversee the end‑to‑end lifecycle of AI models from development through production deployment, including model fine‑tuning, quantization, distillation, and serving infrastructure.
- Inference API & Platform Development: Drive the design and implementation of scalable, low‑latency inference APIs and platforms that serve models reliably at production scale with strict SLA requirements.
- Spec‑Driven Development: Champion rigorous engineering practices including comprehensive technical specifications, design reviews, and documentation to ensure alignment and quality across complex projects.
- Cross‑Functional Collaboration: Partner effectively with research, product, and business stakeholders to translate requirements into technical solutions and communicate progress, trade‑offs, and risks clearly.
- Delivery & Execution: Own quarterly planning, roadmap prioritization, and on‑time delivery of major initiatives. Establish metrics and KPIs to measure team performance and system health.
Qualifications
- 10‑15+ years of software engineering experience with at least 5+ years in engineering leadership roles managing teams of 5+ engineers.
- Proven track record of building and scaling high‑performing engineering teams in high‑growth technology companies.
- Deep expertise in systems programming languages (C++, Go, Rust, or similar) and architecture design.
- Strong background in AI model design, optimization, or adjacent systems‑level programming (LLVM, MLIR, XLA, or similar frameworks).
- Hands‑on experience with AI/ML model development, training, and inference systems.
- Experience with model fine‑tuning techniques and deployment optimization (quantization, pruning, etc.).
- Demonstrated ability to design and build production‑grade APIs and distributed systems.
- Strong understanding of spec‑driven development processes…