Big Data Support Engineer Lead - Vice President
Job in Irving, Dallas County, Texas, 75084, USA
Listed on 2026-05-30
Listing for: Citi
Full Time position
Job specializations:
- IT/Tech: IT Support, Systems Engineer, Cloud Computing, IT Project Manager
Job Description & How to Apply Below
The Big Data Support Engineer Lead is a strategic professional who stays abreast of developments within the field and contributes to directional strategy by considering their application in the role and the business. Recognized as a technical authority for an area within the business, the role requires basic commercial awareness, strong communication and diplomacy skills to guide, influence and convince colleagues across different areas and external customers, and significant impact through complex deliverables.
The Lead provides advice and counsel related to technology or operations and impacts an entire area that ultimately affects overall performance and effectiveness of the sub‑function or job family.
- Has a strong understanding of, and experience in leading, all aspects of Incident Management, Problem Management, Service Improvements, Monitoring and Observability instrumentation, SRE (Site Reliability Engineering) frameworks and adoption, disaster recovery and resiliency, and automation of production services.
- Leads production monitoring and the implementation of observability using AppDynamics (AppD), Splunk and Grafana, drawing on strong knowledge of the monitoring tools used in the industry.
- Collaborates with development, architecture and infrastructure teams and leads service improvement plans.
- Supports the delivery of the L2 Service Delivery and SRE objectives for the business/region.
- Leads the team and contributes towards achievement of service performance against targets for the organization.
- Has a strong bias towards automation and using SRE frameworks.
- Responsible for execution of day‑to‑day service delivery functions including the following:
  - Incident Management: Performs incident triage and root cause analysis, and collects and validates business impact.
  - Service Management: Collaborates with the Technology Organization, manages service risk/maturity assessments and drives service improvement plans.
  - Knowledge Management: Develops/tests knowledge objects to support increased L0, L1, and L2 resolution.
  - Change Management: Reviews and approves changes.
  - Capacity Management: Reviews capacity across service components.
  - Continuity Management: Schedules and facilitates COB testing and maintains recovery plans.
  - Configuration Management: Builds/updates service configuration.
  - Third‑Party Asset Management: Manages third‑party assets (licensing compliance/optimization).
  - Service Readiness: Reviews major releases and new application installations from the early stage of the project/program to ensure risks are documented and remediated before production go‑live.
  - Service Risks: Identifies, documents and manages service risks within applications and effectively manages their resolution.
  - Monitoring: Collaborates and engages with various teams to enable monitoring/observability of production services.
- At least 10 years of hands‑on overall IT experience, including 2 or more years with one or more cloud technologies running services on OpenShift, AWS or Google Cloud.
- Good understanding of the Data Engineering function, role and tools used in technologies such as Ab Initio, Big Data, Master Data Management (MDM) and hybrid cloud.
- Strong knowledge of using CI/CD tools for automated code deployments.
- Strong knowledge of SOAP, REST APIs and microservices.
- Knowledge of creating observability dashboards using Splunk, AppDynamics, ELK and Grafana.
- Working knowledge of Ansible scripts for automation.
- Track record of successfully triaging issues and driving them to resolution.
- Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements.
- Expected to be available on call or on a shift basis for off‑hours production support.
- Can handle multiple, competing priorities simultaneously.
- Ability to work with offshore and onsite production support teams across multiple organizations.
- Excellent oral and written communication skills.
- Good knowledge of disaster recovery processes across data centers.
- Strong analytical and problem‑solving skills and ability to logically break down tasks into smaller manageable parts.
- Strong individual with the ability to communicate and negotiate at all levels and…