Senior AI & Data Engineering Lead - Senior Vice President Job Jersey City area, New Jersey, USA, IT/Tech

This job description outlines a senior-level role for a data architect or lead data engineer within a Data Services team. The position is centered on building and managing the data infrastructure required to support large-scale generative AI and machine learning initiatives.

Expanded Responsibilities

Strategic AI Enablement

This role goes beyond building databases: it involves designing the entire data foundation for the company’s AI strategy.

Data Ecosystem Architecture:
- Data Lake/Lakehouse Design: Implementing a central repository to store vast amounts of structured, semi-structured, and unstructured data from various sources. Technologies include AWS S3, Azure Data Lake Storage, or Google Cloud Storage.
- Federated Querying: Leveraging technologies like Starburst (commercial Trino) to create a virtual data warehouse. This allows data consumers to query data across different sources with a single SQL query, without needing to move or copy the data.
- Scalability and Performance: Ensuring the architecture can scale horizontally to handle petabytes of data and a high volume of concurrent queries, critical for pre-training large language models.
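As a rough illustration of the federated querying idea above, the sketch below joins two separate databases with a single SQL statement. It uses Python's built-in sqlite3 module and attached in-memory databases as a stand-in for Trino/Starburst catalogs; the database names (crm, billing), table names, and sample rows are hypothetical and are not part of this role's actual stack.

```python
import sqlite3


def federated_totals():
    """Join two independent databases with one SQL query.

    Assumption: the attached in-memory SQLite databases only stand in for
    separate catalogs that a federated engine such as Trino would expose.
    """
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates a separate database, playing the role of a distinct source.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS billing")

    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL)")

    # Hypothetical sample data.
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO billing.invoices VALUES (?, ?)",
        [(1, 100.0), (1, 50.0), (2, 75.0)],
    )

    # One query spans both sources; no rows are copied between them first.
    rows = conn.execute(
        """
        SELECT c.name, SUM(i.amount) AS total
        FROM crm.customers AS c
        JOIN billing.invoices AS i ON i.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, total in federated_totals():
        print(name, total)
```

In a real federated engine the qualified names would typically follow a catalog.schema.table pattern, and the engine, not the client, would plan how to read from each source.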

Advanced AI Ops & Data Pipelines

High-Throughput Data Pipelines:
- Batch Processing:
  Using Apache Spark for large-scale data transformation, cleaning, and feature engineering on historical data.
- Real-time Stream Processing:
  Using Apache Kafka as a messaging bus to ingest real-time data. Apache Flink is used for complex event processing on these streams.
Optimization and Reliability:
- Low Latency:
  Tuning jobs and infrastructure to minimize the time data takes to travel from source to destination.
- High Availability:
  Implementing failover mechanisms, monitoring, and alerting to keep pipelines running.
- CI/CD for Data:
  Implementing DevOps and AIOps best practices for data pipelines, including automated testing, deployment, and data quality checks.
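
One way to picture the failover and alerting item above is a small retry wrapper that falls back to a secondary path once the primary keeps failing. This is a minimal sketch under stated assumptions: run_with_failover, its parameters, and the primary/fallback callables are hypothetical names chosen for illustration, and a production pipeline would more likely rely on its orchestrator's own retry and alerting features.

```python
import time


def run_with_failover(primary, fallback, retries=3, base_delay=1.0, on_alert=print):
    """Try `primary` up to `retries` times, then switch to `fallback`.

    All names here are hypothetical. `on_alert` stands in for a real
    alerting hook (pager, chat message, metrics counter).
    """
    for attempt in range(1, retries + 1):
        try:
            return primary()
        except Exception as exc:
            on_alert(f"attempt {attempt} failed: {exc}")
            # Exponential backoff: base_delay, 2x, 4x, ...
            time.sleep(base_delay * 2 ** (attempt - 1))
    on_alert("primary exhausted; switching to fallback")
    return fallback()


if __name__ == "__main__":
    def flaky_source():
        raise ConnectionError("primary source unreachable")

    def secondary_source():
        return "loaded from secondary"

    print(run_with_failover(flaky_source, secondary_source, base_delay=0.1))
```

The fallback is called only after every retry has failed, so a transient error on the primary path does not trigger a switch.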

AI Governance & Leadership

Data Governance for AI:
- Data Quality:
  Implement automated checks and monitoring to ensure data is accurate, complete, and consistent.
- Data Provenance & Lineage:
  Create systems to track where data comes from, how it has been transformed, and how it is used.
- Data Security:
  Work with security teams to implement access controls, data masking, and encryption to protect sensitive information.
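
The data quality and masking items above might look roughly like the following in code. This is an illustrative sketch only: check_quality, mask_email, the field names, and the masking rule are hypothetical, and real deployments would usually use a dedicated data quality framework and the platform's own masking policies.

```python
def check_quality(rows, required_fields, key_field):
    """Return a list of human-readable issues found in `rows`.

    Checks completeness (required fields present and non-empty) and
    uniqueness of `key_field`. All names are hypothetical.
    """
    issues = []
    seen_keys = set()
    for index, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) in (None, ""):
                issues.append(f"row {index}: missing {field}")
        key = row.get(key_field)
        if key in seen_keys:
            issues.append(f"row {index}: duplicate {key_field}={key}")
        seen_keys.add(key)
    return issues


def mask_email(email):
    """Keep the first character and the domain; hide the rest of the local part."""
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


if __name__ == "__main__":
    sample = [
        {"id": 1, "email": "alice@example.com"},
        {"id": 1, "email": ""},
    ]
    for issue in check_quality(sample, ["id", "email"], "id"):
        print(issue)
    print(mask_email("alice@example.com"))
```

Checks like these can run as a pipeline step so that a non-empty issue list blocks the load or raises an alert.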
Team Leadership and Mentorship:
- Mentor Data Engineers:
  Guide junior and mid-level engineers, conduct code reviews, and establish best practices for the team.
- Foster Innovation:
  Stay up to date with new technologies and encourage a culture of experimentation and continuous improvement.
- Cross-functional Collaboration:
  Work closely with data scientists, ML engineers, platform engineers, and business stakeholders to understand their needs and deliver effective data solutions.

Qualifications

10+ years of relevant experience
Experience in implementing projects
Experience in systems analysis and programming of software applications
Demonstrated Subject Matter Expert (SME) in area(s) of Applications Development
Demonstrated knowledge of client core business functions
Demonstrated leadership, project management, and development skills
Relationship and consensus building skills

Education

Bachelor’s degree/University degree or equivalent experience
Master’s degree preferred

Required Skills

Big Data Technologies

Processing Frameworks:
Expert-level knowledge of Apache Spark; strong experience with Apache Flink and Apache Kafka.
Query Engines:
Deep understanding and hands‑on experience with Trino (Starburst).
Orchestration:
Experience with workflow management tools like Airflow or Prefect.

Data Architecture

Data Modeling:
Strong understanding of data modeling concepts for analytical and operational systems.
Platform Design:
Proven experience designing and building scalable data lakes, data warehouses, and lakehouse architectures.
Cloud Expertise:
Proficiency with at least one major cloud provider (AWS, GCP, Azure) and their data services.

Governance & Security

Data Governance:
Experience implementing data quality frameworks, data lineage solutions, and data cataloging tools.
Security:
Knowledge of data security best practices, encryption, masking, and role-based…