Senior Engineering Manager, Managed Platform Services Job, San Francisco area, California, USA, IT/Tech

Crusoe is on a mission to accelerate the abundance of energy and intelligence. As the only vertically integrated AI infrastructure company built from the ground up, we own and operate each layer of the stack - from electrons to tokens - to power the world's most ambitious AI workloads. When you join Crusoe, you join a team that is building the future, faster.

We're in the midst of the greatest industrial revolution of our time. The demand for AI compute is boundless, and power is a bottleneck. We're solving that - with an energy-first approach that makes AI infrastructure better for the world and faster for the people innovating with AI.

We're looking for problem-solving, opportunity-finding teammates with a sense of urgency, who believe in the scale of our ambition and thrive on a path not fully paved - people who want to grow their careers alongside a team of experts across energy, manufacturing, data center construction, and cloud services.

If you want to do the most meaningful work of your career, help our customers and partners advance their AI strategies, and be part of a high-performing team that believes in each other, come build with us at Crusoe.

About the Role:

Join Crusoe as a Senior Engineering Manager and lead a talented team focused on revolutionizing our cloud infrastructure. In this pivotal role, you'll head the Command Center Insights & Actions team, which builds the systems that translate raw infrastructure telemetry into human-readable diagnostics and automated remediation workflows. You'll own a technical roadmap spanning alerting engines, heuristic development, node health systems, and state machines that trigger proactive maintenance without impacting customer workloads, while exploring the integration of Large Language Models (LLMs) to build cutting-edge AI solutions within our Command Center product.

This is a full-time opportunity for a passionate leader who thrives on building high-performing teams, fostering innovation, and delivering impactful, data-driven solutions in a dynamic environment.

What You'll Be Working On:

* Drive the Insights & Actions Roadmap:
Own and execute across alerting infrastructure, control plane APIs, automated action systems, and telemetry-derived insights such as straggler node detection and GPU profiling.

* Influence Strategic Roadmaps:
Contribute significantly to the team's roadmap, impacting long-term team goals and operational performance metrics.

* Refine Early Product Requirements:
Collaborate with product and engineering leadership to bring clarity to ambiguous problems early in the scoping process.

* Collaborate Cross-Functionally:
Partner with product, design, and engineering teams inside and outside the organization to align on goals and deliver integrated solutions.

* Manage Complex Projects:
Lead critical initiatives involving multiple engineers, including those outside your direct report structure, ensuring customer outcomes are auditable and decisions are data-driven.

* Drive Technical Excellence:
Champion process improvements, operational excellence, and best practices across the team.

* Cultivate Team Growth:
Coach and mentor engineers from new grad to Staff level, setting clear performance expectations and defining career paths to build a high-performing, sustainable team.

What You'll Bring to the Team:

* Technical Expertise in Observability & Intelligence Systems:
Hands-on background in ML, heuristics, or rule-based systems - with the ability to engage deeply on problems like anomaly detection, threshold design, and automated remediation logic.

* Proven Leadership:
Demonstrated track record of people management, leading with empathy, and maintaining a sustainable workload for your teams.

* Technical Acumen:
Ability to lead effectively in spaces where problems, opportunities, and strategies are not yet fully defined - driving clarity, direction, and execution.

* Cross-Functional Collaboration:
Excellent technical communication skills, both verbal and written, to work effectively across diverse roles and functions.

* Project Ownership:
Proven experience owning and delivering complex projects end-to-end, with measurable quality and data-driven decision-making.

* Global Scale Experience:
Background building and operating global services at scale.

* Organizational Prowess:
Highly organized and capable of managing multiple complex initiatives and team priorities in parallel.

Bonus Points

* Background in data platforms and data science

* Background in observability platforms or products

* Familiarity with GPU profiling tools (Nsight, NCCL Inspector) or infrastructure diagnostics at the hardware layer

* Highly motivated and proactive in identifying process improvements and boosting team efficiency

* Passion for coaching and mentoring engineers into high-performing individuals

* Enthusiasm for building team culture with a high quality of life for engineers

* A true "people-person" who thrives in collaborative environments and is energized by teamwork

Benefits:

* Competitive…
