Senior AI & Data Engineering Lead - Senior Vice President
Listed on 2026-06-22
IT/Tech
Data Engineering
Senior‑level role for a data architect or lead data engineer within a Data Services team. The position is centered on building and managing the data infrastructure required to support large‑scale Generative AI and Machine Learning initiatives. Below is a detailed breakdown of the responsibilities and the skills required for such a role.
Expanded Responsibilities
This role combines deep technical expertise in data engineering with strategic thinking and leadership. The core responsibilities can be broken down into three main pillars:
1. Strategic AI Enablement
This goes beyond just building databases; it is about designing the entire data foundation for the company's AI strategy.
- Data Ecosystem Architecture: Own the high‑level design of the data platform, including:
- Data Lake / Lakehouse Design: Implementing a central repository to store vast amounts of structured, semi‑structured, and unstructured data from various sources using technologies such as AWS S3, Azure Data Lake Storage, or Google Cloud Storage.
- Federated Querying: Leveraging technologies such as Starburst (commercial Trino) to create a virtual data warehouse. This allows data consumers—including analysts, data scientists, and AI models—to query data across different sources with a single SQL query, without needing to move or copy the data.
- Scalability and Performance: Ensuring the architecture can scale horizontally to handle petabytes of data and a high volume of concurrent queries, which is critical for pre‑training large language models (LLMs).
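As a rough illustration of the federated querying idea above, the sketch below joins tables that live in two separate stores with a single SQL statement, without copying rows into a common warehouse first. It is only a stand-in: two SQLite files play the part of independent source systems and `ATTACH DATABASE` plays the part of the catalogs a Trino/Starburst cluster would expose. The table names, file names, and the `federated_totals` helper are hypothetical and not part of the role description.

```python
import os
import sqlite3
import tempfile


def build_sources(workdir):
    """Create two independent SQLite files standing in for separate source systems."""
    crm_path = os.path.join(workdir, "crm.db")
    sales_path = os.path.join(workdir, "sales.db")

    crm = sqlite3.connect(crm_path)
    crm.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])
    crm.commit()
    crm.close()

    sales = sqlite3.connect(sales_path)
    sales.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    sales.executemany(
        "INSERT INTO orders VALUES (?, ?)", [(1, 120.0), (1, 80.0), (2, 50.0)]
    )
    sales.commit()
    sales.close()
    return crm_path, sales_path


def federated_totals(crm_path, sales_path):
    """Join across both sources with one query; the data stays in its own file."""
    conn = sqlite3.connect(":memory:")
    try:
        # Each ATTACH is a loose analogue of registering a catalog in a federated engine.
        conn.execute("ATTACH DATABASE ? AS crm", (crm_path,))
        conn.execute("ATTACH DATABASE ? AS sales", (sales_path,))
        rows = conn.execute(
            """
            SELECT c.name, SUM(o.amount) AS total
            FROM crm.customers AS c
            JOIN sales.orders AS o ON o.customer_id = c.id
            GROUP BY c.name
            ORDER BY c.name
            """
        ).fetchall()
    finally:
        conn.close()
    return rows


def demo():
    """Build the two toy sources in a temp directory and run the cross-source query."""
    with tempfile.TemporaryDirectory() as workdir:
        crm_path, sales_path = build_sources(workdir)
        return federated_totals(crm_path, sales_path)
```

In a production engine the same pattern would be expressed as a query over fully qualified `catalog.schema.table` names, with the engine pushing work down to each source where it can.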
The second pillar is the hands‑on engineering aspect of the role, focused on the movement and processing of data.
- High‑Throughput Data Pipelines: Lead the development of the data "plumbing" that powers the AI systems. Includes:
- Batch Processing: Using Apache Spark for large‑scale data transformation, cleaning, and feature engineering on historical data.
- Real‑time Stream Processing: Using Apache Kafka as a messaging bus to ingest real‑time data from sources such as application logs, IoT devices, or clickstreams, and Apache Flink for complex event processing on these streams (e.g., fraud detection, real‑time recommendations).
- Optimization and Reliability: Pipelines must be fast and resilient. Includes:
- Low Latency: Tuning jobs and infrastructure to minimize the time it takes for data to travel from source to destination.
- High Availability: Implementing failover mechanisms, monitoring, and alerting to ensure that data pipelines run continuously and AI models always have access to fresh data.
- CI/CD for Data: Implementing DevOps and AIOps best practices, including automated testing, deployment, and data quality checks.
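To make the "automated data quality checks" item more concrete, here is a minimal sketch of a batch validator that a CI job or pipeline step could run before data is published downstream. The field names (`event_id`, `user_id`, `amount`), the rule labels, and the `check_batch` helper are assumptions made for illustration; a real team would likely use a dedicated data quality framework and its own rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    """One failed rule, identified by the row index and a short rule label."""
    row: int
    rule: str


def check_batch(rows, required=("event_id", "user_id", "amount")):
    """Return all violations found in a batch of dict records.

    Rules (all hypothetical):
      - completeness: every required field is present and not None
      - validity: amount is not negative
      - uniqueness: event_id does not repeat within the batch
    """
    violations = []
    seen_ids = set()
    for i, row in enumerate(rows):
        for field in required:
            if row.get(field) is None:
                violations.append(Violation(i, f"missing:{field}"))

        amount = row.get("amount")
        if amount is not None and amount < 0:
            violations.append(Violation(i, "negative:amount"))

        event_id = row.get("event_id")
        if event_id is not None:
            if event_id in seen_ids:
                violations.append(Violation(i, "duplicate:event_id"))
            seen_ids.add(event_id)
    return violations


def gate(rows):
    """Raise if the batch fails any rule, so a CI step or pipeline task stops."""
    violations = check_batch(rows)
    if violations:
        raise ValueError(f"{len(violations)} data quality violation(s): {violations}")
    return True
```

A deployment pipeline could call `gate` on a sample of each new batch and block promotion when it raises, which keeps bad records from reaching model training or serving.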
The third pillar focuses on the "people" and "process" aspects of the role, ensuring data is used responsibly and effectively.
- Data Governance for AI: Establish frameworks for:
- Data Quality: Implementing automated checks and monitoring to ensure data is accurate, complete, and consistent.
- Data Provenance & Lineage: Creating systems to track where data comes from, how it has been transformed, and how it is used—essential for debugging models and regulatory compliance.
- Data Security: Working with security teams to implement access controls, data masking, and encryption to protect sensitive information, especially in AI training contexts.
- Team Leadership and Mentorship: The role involves:
- Mentor Data Engineers: Guide junior and mid‑level engineers, conduct code reviews, and establish best practices for the team.
- Foster Innovation: Stay up‑to‑date with the latest technologies and methodologies in data and AI, encouraging experimentation and continuous improvement.
- Cross‑functional Collaboration: Work closely with data scientists, ML engineers, platform engineers, and business stakeholders to understand needs and deliver effective data solutions.
- 10+ years of relevant experience
- Experience in implementing projects
- Experience in systems analysis and programming of software applications
- Demonstrated Subject Matter Expert (SME) in Applications Development
- Demonstrated knowledge of client core business functions
- Demonstrated leadership,…