Principal AI Site Reliability Engineer, EI Production Services
Listed on 2026-04-29
IT/Tech
SRE/Site Reliability, Cloud Computing, Systems Engineer
Note:
Fidelity will not provide immigration sponsorship for this position.
The EI Production Services organization at Fidelity is seeking a strategic and proactive Principal AI Site Reliability Engineer (SRE). In this role, you will drive operational excellence, observability, and intelligent automation for mission‑critical contact center applications supporting Wealth and Workplace Investing business units. You will lead efforts to reduce manual toil, enhance associate experience, and improve system reliability by leveraging AI‑driven automation and industry best practices.
Your work will transform the support model for critical contact center applications. Faster triage and improved resiliency will reduce downtime, improve associate productivity, and deliver a superior experience for associates and customers.
- Lead initiatives to advance observability, automation, and operational efficiency for critical associate‑facing applications.
- Drive proactive monitoring and AI‑powered telemetry to minimize reactive incident response and accelerate resolution.
- Collaborate with engineering and business leaders to prioritize and resolve issues impacting associate experience.
- Implement automation and self‑service capabilities to reduce manual intervention and improve reliability.
- Establish and track SLIs/SLOs to measure and optimize system performance.
- Communicate progress, outcomes, and technical concepts clearly to senior leadership and stakeholders.
- 10+ years in technology operations, systems engineering, or production support leadership.
- Proven ability to deliver complex improvement initiatives in large‑scale, high‑availability environments.
- Deep expertise in IT Service Management (ITSM), incident/problem management, and operational process optimization.
- Advanced knowledge of observability and monitoring tools (OpenTelemetry, Splunk, Datadog, Prometheus, Grafana).
- Experience leveraging AI and automation to drive efficiency and reliability.
- Proficiency in scripting and automation (Python, Bash, PowerShell, or similar).
- Strong understanding of on‑premises and public cloud (AWS/Azure/GCP) environments.
- Familiarity with networking, load balancing, and security fundamentals.
- Agile and DevOps mindset with experience in CI/CD and operational automation.
- Exceptional communication, collaboration, and stakeholder management skills.
- Data‑driven approach to problem‑solving and progress tracking.
- Leadership excellence: ability to inspire, mentor, and guide teams toward operational excellence.
- Optional certifications: ITIL, AWS, SRE‑related credentials.
Category:
Information Technology
Most roles at Fidelity are hybrid, requiring associates to work onsite every other week (all business days, M‑F) in a Fidelity office. This does not apply to remote or fully onsite roles.