Senior Manager, AI Reliability Operations (Morrisville, North Carolina, USA; IT/Tech)

General Information

Req #: WD

Career area:
Software Engineering

Country/Region:
United States of America

State:
North Carolina

City:
Morrisville

Date:
Monday, March 2, 2026

Working time:
Full-time

Additional Locations:
United States of America - Illinois - Chicago

Why Work at Lenovo

We are Lenovo. We do what we say. We own what we do. We WOW our customers.

Lenovo is a US $69 billion revenue global technology powerhouse, ranked #196 in the Fortune Global 500, and serving millions of customers every day in 180 markets. Focused on a bold vision to deliver Smarter Technology for All, Lenovo has built on its success as the world's largest PC company with a full-stack portfolio of AI-enabled, AI-ready, and AI-optimized devices (PCs, workstations, smartphones, tablets), infrastructure (server, storage, edge, high performance computing and software defined infrastructure), software, solutions, and services.

Lenovo's continued investment in world-changing innovation is building a more equitable, trustworthy, and smarter future for everyone, everywhere. Lenovo is listed on the Hong Kong stock exchange under Lenovo Group Limited (HKSE: 992) (ADR: LNVGY).

This transformation, together with Lenovo's world-changing innovation, is building a more inclusive, trustworthy, and smarter future for everyone, everywhere. To find out more, visit Lenovo's website and read about the latest news via our Story Hub.

Description and Requirements

About Our Team

Lenovo is building Quantum, a next-generation hybrid AI platform that spans Windows, Android, and cloud. As part of this initiative, we are expanding the Qira organization. Qira is Lenovo's cross-device Personal AI that works seamlessly across Lenovo and Motorola products.

We are seeking a Senior Manager, AI Reliability Operations to lead the operational backbone that keeps Qira safe, stable, performant, and continuously improving. This leader will own the Operations pillar within the Qira SRE organization and will be responsible for on-call excellence, incident response, AI change safety, deployment reliability, and production governance across device, edge, and cloud environments.

This is a high-impact leadership role shaping how Qira operates at global scale.

Location:

Open to remote work in the US. The preferred work location is Chicago, IL.

What You'll Do

Operational Leadership

Lead and scale the Operations pillar within Qira SRE, including on-call/NOC, incident management, deployments, and operational readiness.
Drive operational excellence for Qira's hybrid AI systems across on-device, edge, and cloud environments.
Establish a world-class follow-the-sun on-call model, ensuring rapid detection, response, and recovery from incidents.

Incident & Crisis Management

Own incident response, including command, coordination, communications, and post-incident analysis.
Create a culture of blameless postmortems and continuous learning.
Build automation, runbooks, and tooling that dramatically reduce MTTR and operational toil.

AI Deployment & Change Safety

Own the AI change management lifecycle for model, prompt, retriever, index, and policy updates.
Implement safe rollout mechanisms including shadow testing, canarying, evaluation gates, and automated rollback policies.
Ensure every production change meets reliability, safety, and auditability standards.

Operational Governance

Own operational governance frameworks, including runbook requirements, change controls and ITSM, incident taxonomies, operational readiness reviews, and reliability sign-off for launches.
Partner with Security, Compliance, and Product Safety on runtime policy enforcement and operational safeguards.

Cross-Functional Partnership

Partner with AI/ML, Platform, Firmware, DevOps, and Product teams to ensure reliability and operational criteria are built into every release.
Collaborate closely with Observability, Service Reliability Engineering, and AI Reliability pillars in a unified reliability mission.
Advocate for and help prioritize operational improvements across the engineering ecosystem.

Team & Talent Leadership

Hire, mentor, and grow a high-performing global team of SREs, DevOps engineers, and incident specialists.
Foster a culture of accountability, collaboration, and operational craftsmanship.
Define career paths and leadership opportunities for reliability operations staff.

Basic Qualifications

10+ years in Site Reliability Engineering, Production Engineering, DevOps, or large-scale operations, including 3+ years leading teams.
Bachelor's Degree in Computer Science, Engineering, or related technical field.
Experience running mission-critical on-call operations for distributed systems.
Deep knowledge of incident management, crisis response, and postmortem practices.
Hands-on experience with CI/CD pipelines, deployments, and change management.
Experience operating systems in cloud environments (AWS, Azure, GCP).
Strong understanding of Linux systems, networking, and distributed system fundamentals.
Excellent leadership, communication, and cross-functional alignment skills.

Preferred…

