Lead AI SRE/AI Ops Engineer
Listed on 2026-05-16
IT/Tech
SRE/Site Reliability, Cloud Computing, IT Support
Headquartered in California, U.S.A., GSPANN provides consulting and IT services to global clients. We help clients transform how they deliver business value by helping them optimize their IT capabilities and practices. With five global delivery centers and 2000+ employees, we provide the intimacy of a boutique consultancy with the capabilities of a large IT services firm.
Role: Lead AI SRE/AI Ops Engineer
Location: Fremont, CA (Hybrid Onsite)
Duration: 12+ Months
Role Summary
We are looking for a strong hands‑on Lead AI-Assisted SRE / AIOps Engineer to help operationalize and scale an SRE agent‑driven operations model. This role will lead the onboarding of existing scripts, SOPs, and operational workflows into the SRE agent while also supporting production releases, validation, incident response, and operational governance.
This is not a pure support role. The ideal candidate must be technically strong, practical, and capable of using independent judgment rather than relying blindly on AI outputs.
Experience
- 5-7 years of hands‑on experience in IT operations, cloud operations, SRE, platform support, or production engineering
- Proven experience in production support, incident handling, automation, and operational troubleshooting
- Experience working with monitoring, observability, scripting, and release validation
- Exposure to AIOps, AI‑assisted operations, or automation‑led support models is strongly preferred
Key Responsibilities
- Lead the adoption and operationalization of the SRE agent across support and reliability workflows
- Translate existing scripts, runbooks, SOPs, and operational knowledge into agent‑compatible workflows
- Work with teams to identify which use cases should be automated, semi‑automated, or remain human‑driven
- Validate agent outputs, recommendations, and remediation steps before operational use
- Support production releases, release validation, smoke testing, and post‑release health checks
- Drive troubleshooting during incidents and ensure proper root cause analysis and follow‑through
- Improve alert handling, event correlation, and operational response patterns
- Coordinate with engineering, operations, and platform teams on onboarding and process changes
- Mentor junior engineers and guide them on workflow design, validation, and operational execution
- Maintain high‑quality documentation, runbooks, and operational standards
Technical Skills
- Strong hands‑on scripting experience in PowerShell, Python, and Shell/Bash
- Experience with monitoring, alerting, logs, dashboards, and incident workflows
- Good understanding of production support processes, release support, and validation practices
- Experience with cloud platforms, preferably Azure
- Familiarity with ITSM/ticketing tools such as ServiceNow, Jira, or similar
- Ability to understand existing operational scripts and modernize them into scalable workflows
- Experience with APIs, integrations, or automation pipelines is preferred
- Exposure to Kubernetes/AKS and AI tools such as ChatGPT and Copilot is a plus
GSPANN is a diverse, prosperous, and rewarding place to work. We provide competitive benefits, educational assistance, and career growth opportunities to our employees. Every employee is valued for their talent and contribution. Working with us will give you an opportunity to work globally with some of the best brands in the industry.
The company does and will take affirmative action to employ and advance in employment individuals with disabilities and protected veterans, and to treat qualified individuals without discrimination based on their physical or mental disability status. GSPANN is an equal opportunity employer for minorities/females/veterans/disabled.