Systems Operations Manager - Data Platforms (Teradata & Hadoop)

Location: Irving area, Texas, USA
Category: IT/Tech

Wells Fargo is back in the office collaborating for fabulous outcomes.

This role is in the office three days a week.

No visa sponsorship or visa transfers.

About this role:

Wells Fargo is seeking a Systems Operations Manager to lead the end-to-end support and operations of enterprise Teradata and Hadoop data platforms powering large-scale analytics and business decisioning.

This role is accountable for platform stability, reliability, and operational excellence across a complex, multi-tenant ecosystem supporting 100+ tenants. The manager will lead a 24x7 operations team, apply Site Reliability Engineering (SRE) principles, and drive automation-led transformation to ensure predictable, resilient service delivery at scale.

This is a hands-on leadership role requiring strong execution discipline, ownership, and the ability to operate in a high-risk, regulated environment, ensuring SLA adherence, compliance, and business continuity outcomes.

In this role, you will:

Operational Leadership & Platform Ownership

* Lead end-to-end platform operations for Teradata and Hadoop environments, ensuring availability, performance, and resilience

* Provide clear ownership and accountability for production services, operational outcomes, and service stability

* Drive incident, problem, and change management, including major incident command and recovery leadership

* Lead 24x7 global support operations, including on-call governance and escalation management

Operational Excellence & Service Performance

* Own and drive SLA/OLA adherence, uptime, and service health metrics

* Lead capacity management, performance tuning, and proactive issue prevention initiatives

* Establish and enforce operational standards, runbooks, and service management practices

* Drive root cause analysis (RCA) and long-term remediation of systemic issues

* Drive adoption of automation, observability, and AIOps practices to reduce manual toil and improve MTTR

Governance, Risk & Compliance

* Ensure alignment with enterprise risk, compliance, and change management frameworks

* Drive patching, vulnerability remediation, and platform security posture

* Maintain audit readiness, documentation quality, and control adherence

* Identify, escalate, and mitigate operational and platform risks

Multi-Tenant Platform Operations

* Manage operations across shared, multi-tenant platforms, ensuring workload isolation and stability

* Oversee resource allocation, scheduler configuration, and workload prioritization

* Execute in high-risk production environments where changes impact multiple tenants simultaneously

Site Reliability Engineering (SRE) & Automation

* Apply SRE principles to improve reliability, availability, and scalability of data platforms

* Drive automation-first operations to eliminate manual toil and standardize service delivery

* Implement and enhance observability, monitoring, and self-service capabilities

* Partner with engineering teams to improve platform reliability, operability, and service maturity

Stakeholder Engagement & Execution Alignment

* Partner with Engineering, CIO-aligned teams, Cybersecurity, and LOB stakeholders

* Provide clear, executive-ready communication on platform health, risks, and priorities

* Drive cross-functional accountability and execution discipline across teams

People Leadership & Talent Development

* Lead, coach, and develop a team of Systems Operations engineers and analysts

* Build a culture of ownership, accountability, and operational excellence

* Manage resource allocation, workforce planning, and vendor/partner support

* Develop team capabilities in SRE practices, automation, and platform operations maturity

Resiliency & Business Continuity

* Ensure resiliency posture across Teradata and Hadoop platforms, including:

  * Disaster recovery (DR) readiness and execution

  * RTO/RPO alignment and validation

  * Continuous improvement of recovery capabilities

* Lead BCP execution and failover coordination for critical platforms

Required Qualifications:

* 5+ years of Systems Engineering and Technology Architecture experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education

* 2+ years of Leadership experience

* Hands-on experience with:

  * Teradata and Hadoop platforms

  * Distributed systems and data platform operations

  * Incident, problem, and change management processes

Desired Qualifications:

* Experience supporting enterprise-scale Teradata and Hadoop platforms

* Demonstrated leadership in 24x7 production support and SRE environments

* Strong experience in:

  * Automation, AIOps, and operational transformation

  * DevSecOps and CI/CD practices

  * Observability, monitoring, and platform telemetry

* Familiarity with Kubernetes, containerization, and cloud-native architectures

* Strong understanding of:

* Multi-tenant data platforms and…