Manager, Production Support Automation
Listed on 2026-06-01
IT/Tech
SRE/Site Reliability, Systems Engineer
Overview
Collaborative. Respectful. A place to dream and do. These are just a few words that describe what life is like at Toyota. As one of the world’s most admired brands, Toyota is growing and leading the future of mobility through innovative, high-quality solutions designed to enhance lives and delight those we serve. We’re looking for talented team members who want to Dream. Do. Grow. with us.
An important part of the Toyota family is Toyota Financial Services (TFS), the finance and insurance brand for Toyota and Lexus in North America. While TFS is a separate business entity, it is an essential part of this world-changing company, delivering on Toyota's vision to move people beyond what's possible. At TFS, you will help create a best-in-class customer experience in an innovative, collaborative environment.
We’re Looking For
This role requires a strong blend of deep technical expertise, operational rigor, and people leadership. The Operations Manager acts as the technical authority for operational automation and reliability while leading teams that partner closely with Engineering, SRE, Infrastructure, and Service Management. You will be responsible for ensuring production stability and reliability while driving the operational automation strategy to enhance efficiency. You will also lead efforts in observability and monitoring to proactively identify and resolve issues.
Excellence in managing incidents, problems, and changes is essential to maintain seamless operations. Additionally, the role requires strong team leadership skills focused on building capability and fostering continuous development within the team.
- Design and implement automated health checks, self‑healing workflows, runbook automation, and zero‑touch operations.
- Define SLAs, SLOs, and SLIs; lead incident response; drive RCA automation outcomes; and partner with engineering teams to embed operational readiness.
- Experience with Automation & Scripting: Python, Bash, PowerShell
- Proficiency with Automation Tooling: UiPath (RPA)
- Experience with Automation Tools: Ansible, Puppet, Chef, Jenkins, CI/CD
- Proficiency with Monitoring: Dynatrace, Grafana, OpenTelemetry
- Experience with Operations: Incident, Problem, Change Management, ITIL, SRE concepts
- Successfully reduced manual operational work by implementing automation tools and streamlined processes, resulting in increased team efficiency and reduced human error.
- Achieved significant improvements in Mean Time to Recovery (MTTR) and overall incident reduction through proactive incident management and root cause analysis.
- Increased the rate of auto‑remediation by designing and deploying intelligent automated workflows that resolve common issues without manual intervention.
- Enhanced monitoring signal quality by refining alert thresholds and integrating advanced observability tools, leading to more accurate and actionable insights for faster issue detection.
Applicants for our positions are considered without regard to race, ethnicity, national origin, sex, sexual orientation, gender identity or expression, age, disability, religion, military or veteran status, or any other characteristics protected by law.
Please send an email to .