Senior Director, Data & Analytics Engineering - Fan Genome Platform - MLS
Listed on 2026-01-01
IT/Tech
Data Engineer, Data Science Manager
Overview
Major League Soccer (MLS) has built Fan Genome, an advanced 360° fan intelligence platform that unifies demographic, behavioral, and transactional data to deliver hyper-personalization and real-time insights across every fan interaction. We are seeking a hands-on technical leader to own the architecture and evolution of MLS’s next-generation data platform, powering Fan Genome while delivering BI self-service and robust analytics engineering frameworks. This role brings together real-time streaming, distributed compute, open table formats, zero-copy analytics, and enterprise-grade governance to enable advanced analytics and fan engagement at scale.
Responsibilities
- Own the technical architecture and feature delivery of MLS’s next-generation cloud-native Lakehouse platform, ensuring scalability, performance, and reliability
- Optimize and enhance existing real-time data pipelines built on Apache Kafka, Amazon Kinesis, and Apache Flink to maintain low-latency ingestion and event-driven processing at scale
- Manage and improve distributed compute workflows leveraging Apache Spark for large-scale batch processing, advanced feature engineering, and ML-adjacent workloads
- Oversee and refine open table format implementations (Apache Hudi, Apache Iceberg) to ensure ACID compliance, schema evolution, and efficient incremental processing
- Drive performance tuning and cost optimization for zero-copy analytics using modern distributed, column-oriented MPP OLAP systems designed for real-time, high-concurrency analytical workloads (e.g., StarRocks) and query engines such as Presto
- Maintain and extend robust data APIs for both batch exports and point (per-fan) queries, integrated with Fan Genome’s feature store
- Advance identity resolution capabilities to ensure accurate, unified fan profiles across multiple data sources
- Establish enterprise-grade governance and security with frameworks such as AWS Lake Formation for cataloging, lineage, and fine-grained access control
- Work with the BI team to deliver BI self-service and analytics engineering frameworks, including:
  - Designing semantic models, data contracts, and governed data for consistency and trust in reporting
  - Building curated wide tables (OBTs) and optimized query layers for high-performance dashboards and ad-hoc analysis
  - Implementing data modeling best practices, version-controlled transformations, and automated testing to ensure reliability and scalability
- Build, mentor, and scale a world-class data and analytics engineering team, fostering a culture of technical excellence and innovation
Qualifications
- Bachelor’s degree in Computer Science or a related field required (Master’s preferred)
- 10+ years of progressive experience in data engineering or platform engineering, including 8+ years in leadership roles with a proven track record of delivering production-grade, large-scale data and analytics platforms
Required Skills
- Hands-on expertise in designing, deploying, and optimizing cloud-native data solutions on platforms such as AWS, Azure, or GCP
- Deep understanding of modern data architecture patterns, including Lakehouse design, data mesh principles, and data quality monitoring frameworks
- Demonstrated ability to translate complex business requirements into scalable technical solutions, collaborating with data management, security, and privacy teams to ensure compliance and governance
- Strong computer science fundamentals with proficiency in at least one advanced programming language (Python, Scala, or Java)
- Proven experience with distributed processing frameworks (e.g., Apache Spark, Apache Flink) and real-time streaming architectures
- Expertise in Lakehouse data platforms built on object storage and open table formats (e.g., Apache Hudi, Apache Iceberg) for ACID transactions, schema evolution, and incremental processing
- Proficiency in Infrastructure-as-Code, orchestration, transformation frameworks, containers, and observability tools
- Familiarity with data science and machine learning workflows, including feature…