Site Reliability Engineer; Collaboration Engineering
Listed on 2026-06-19
IT/Tech
Cloud Computing: Infrastructure & Operations, Systems Engineer, IT Support
Staff Site Reliability Engineer (Collaboration Engineering)
- Full-time
- Business Segment:
Operations & Technology
NBCUniversal is one of the world's leading media and entertainment companies. We create world-class content, which we distribute across our portfolio of film, television, and streaming, and bring to life through our global theme park destinations, consumer products, and experiences. We own and operate leading entertainment and news brands, including NBC, NBC News, NBC Sports, Telemundo, NBC Local Stations, Bravo, and Peacock, our premium ad‑supported streaming service.
We produce and distribute premier filmed entertainment and programming through our powerhouse film and television studios, including Universal Pictures, DreamWorks Animation, and Focus Features, and the four global television studios under the Universal Studio Group banner, and operate industry‑leading theme parks and experiences around the world through Universal Destinations & Experiences, including Universal Orlando Resort, home to Universal Epic Universe, and Universal Studios Hollywood.
NBCUniversal is a subsidiary of Comcast Corporation.
Our impact is rooted in improving the communities where our employees, customers, and audiences live and work. We have a rich tradition of giving back and ensuring our employees have the opportunity to serve their communities. We champion an inclusive culture and strive to attract and develop a talented workforce to create and deliver a wide range of content reflecting our world.
The Staff Site Reliability Engineer (SRE) for Workplace Engineering is responsible for the reliability, performance, security, and operational excellence of enterprise workplace collaboration and endpoint services used globally by employees and partners. This role applies an engineering mindset to operations: defining service level indicators and objectives (SLIs/SLOs), reducing toil through automation, improving observability, and strengthening incident response. The goal is a consistent, high‑quality collaboration experience across messaging, meetings, voice, file sharing, knowledge sharing, device management platforms, and Copilot/AI engineering.
- Microsoft 365: Teams (chat, meetings, webinars, Teams Phone), SharePoint Online, OneDrive, Exchange Online, Microsoft Entra (Azure AD), Microsoft Purview, Defender for Office 365, Intune (Endpoint Management).
- Hybrid messaging and identity integrations (as applicable): Exchange Server, directory synchronization, mail flow and routing.
- Collaboration endpoints and devices: Teams Rooms, certified headsets/cameras, conference room AV integrations.
- Ecosystem integrations: Power Platform (Power Automate/Apps), Graph API, third‑party conferencing/messaging where in use (e.g., Zoom/Slack), mail hygiene/security gateways.
- Architect and optimize global Microsoft Intune and Jamf Pro environments.
- Orchestrate Windows Update for Business (WUfB), third‑party application patching, and compliance policies to maintain a hardened security posture.
- Automate packaging and deployment of Windows applications, maintaining a rigorous cadence for third‑party updates.
- Leverage PowerShell and Graph API to automate repetitive configuration tasks and self‑healing remediations.
- Partner with Security Operations to remediate vulnerabilities.
- Develop and enforce Configuration Profiles, Compliance Policies, and Conditional Access rules.
- Own the reliability and scaling of Azure Virtual Desktop (AVD) and Windows 365 (Cloud PC), optimizing for both performance and cost‑efficiency.
- Define and operationalize SLIs/SLOs and error‑budget policies for collaboration services (Teams chat/meetings/voice, SharePoint/OneDrive, Exchange) with clear customer‑impact measurements.
- Own end‑to‑end reliability engineering: capacity planning, performance tuning, resilience reviews, dependency mapping, and proactive risk reduction for critical collaboration journeys.
- Demonstrated expertise in developing, operationalizing, and scaling AI engineering capabilities, including platform design, model lifecycle management, automation, reliability, and enterprise adoption.
- Strong knowledge of AI governance frameworks, with…