Data Engineer/ML Engineer
Listed on 2026-02-16
Job Title: Data Engineer/ML Engineer
Department: Applied AI & Data Engineering
Type: Full-time (FTE)
Reports to: Head of Applied AI & Data Engineering
This role requires a minimum of four (4) days per week working onsite at EnStream’s head office in Toronto; this requirement may be changed at management’s discretion.
Who is EnStream
EnStream is a leader in secure digital identity and mobile data intelligence, working to advance the future of digital trust in Canada. We build innovative data-driven models that enhance the integrity, reliability, and safety of digital identity ecosystems. Our latest initiative leverages advanced data science, machine learning, and deep learning to further grow and sustain digital trust across Canada.
Our mission is to empower frictionless trust in every interaction. EnStream is dedicated to increasing trust and convenience for Canadians using real-life, verified identities and network data held by trusted telco networks. At EnStream, every team member plays a critical role in shaping our strategy and delivering meaningful impact across industries.
About the Role
We’re hiring a hands‑on Data & ML Engineer to help build and scale the EnStream Trust Platform’s data platform and machine learning pipelines. You’ll design robust data and ML pipelines across internal and partner data sources, with a strong focus on production readiness, observability, and repeatability. The pipelines you’ll build and support span tabular and graph features and AI/ML models, using unsupervised and semi‑supervised approaches for anomaly detection, clustering, and risk scoring.
What You’ll Do
- Design and implement the EnStream Trust Platform’s data platform on AWS, including ingestion, data quality, error management, data value, data flow, data security design patterns, curated/feature‑ready datasets, and governed access layers
- Build and maintain scalable ETL/ELT pipelines (batch and/or streaming as needed) with strong data quality controls (schema checks, validation rules, reconciliations) and clear lineage/metadata; a minimal sketch of such checks follows this list
- Develop production‑grade data pipelines for both tabular and graph signals, supporting unsupervised and semi‑supervised learning workflows
- Implement end‑to‑end observability for data and ML pipelines: logging, metrics, tracing, alerting, and dashboards for pipeline health, data quality, latency, and model performance/drift where applicable
- Establish engineering best practices for reliability and handoff: versioned code and datasets, configuration‑driven runs, CI/CD for pipelines, and runbooks for operations and incident response
- Partner with product and external partners to align on data contracts, delivery cadence, and measurable outcomes
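To make the schema and validation checks mentioned above concrete, here is a minimal sketch in pandas. Every column name, dtype, and threshold below is an illustrative assumption, not an actual EnStream data contract.

```python
# Minimal data-quality gate for a batch pipeline step (pandas).
# All names and thresholds here are hypothetical stand-ins.
import logging

import pandas as pd

logger = logging.getLogger("pipeline.quality")

EXPECTED_SCHEMA = {  # hypothetical data contract
    "subscriber_id": "object",
    "event_ts": "datetime64[ns]",
    "risk_score": "float64",
}


def validate_batch(df: pd.DataFrame) -> pd.DataFrame:
    """Fail fast on schema drift, then quarantine rows that break simple rules."""
    # 1. Schema check: every expected column must exist with the right dtype.
    for col, dtype in EXPECTED_SCHEMA.items():
        if col not in df.columns:
            raise ValueError(f"schema drift: missing column {col!r}")
        if str(df[col].dtype) != dtype:
            raise ValueError(f"schema drift: {col!r} is {df[col].dtype}, expected {dtype}")

    # 2. Row-level validation: quarantine bad rows instead of dropping them silently,
    #    and log the count so it can feed a metric/alert.
    bad = df["risk_score"].isna() | ~df["risk_score"].between(0.0, 1.0)
    if bad.any():
        logger.warning("quarantining %d of %d rows", int(bad.sum()), len(df))
    return df[~bad]


if __name__ == "__main__":
    batch = pd.DataFrame({
        "subscriber_id": ["a1", "b2", "c3"],
        "event_ts": pd.to_datetime(["2026-01-01", "2026-01-02", "2026-01-03"]),
        "risk_score": [0.2, 1.7, None],  # two rows violate the rule
    })
    print(validate_batch(batch))
```

In a production pipeline the same pattern would typically be expressed in PySpark or a dedicated data-quality framework, with the quarantine counts emitted as metrics for dashboards and alerting.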
What You’ll Bring
- Hands‑on experience engineering data and ML pipelines on AWS (e.g., S3, Glue/Athena/EMR, Redshift, Step Functions, SageMaker), including orchestration, monitoring, and cost/performance considerations
- Strong Python (PySpark and/or pandas) and SQL, with a track record of building reliable, maintainable data pipelines and feature datasets
- Proven ability to implement observability for pipelines (data quality monitoring, metrics/logging, alerting, dashboarding) and operate services in production
- Experience supporting ML workflows end‑to‑end (data/feature generation, training/scoring pipelines, reproducible environments, and configuration/parameter traceability)
- Exposure to both tabular and graph data modeling contexts, including unsupervised and/or semi‑supervised approaches used to generate risk/anomaly/clustering signals
- Prior data science experience (or strong applied analytics background) to help validate assumptions and interpret model outputs with stakeholders
- Familiarity with modern MLOps tooling and patterns (experiment tracking, model registry, CI/CD for ML, infrastructure as code)
- Experience with graph analytics/graph ML frameworks (e.g., NetworkX, PyG, DGL) and/or graph databases (e.g., Neptune, Neo4j); a minimal sketch of this kind of graph signal follows this list
- Experience with streaming data systems and event‑driven pipelines (e.g., Kinesis, Kafka)
- Experience with containerized workloads and orchestration (Docker, Kubernetes/EKS) and infrastructure automation
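As a flavour of the unsupervised graph signals referenced above, the sketch below uses NetworkX degree centrality as a crude sharing/anomaly indicator. The edge list, identifier formats, and the 0.4 threshold are hypothetical illustrations, not EnStream’s actual model.

```python
# Toy unsupervised graph signal: flag identifiers linked to unusually
# many others. Data and threshold are hypothetical illustrations.
import networkx as nx

# Hypothetical identity graph: nodes are identifiers, edges are observed links.
edges = [
    ("phone:416-555-0001", "device:aaa"),
    ("phone:416-555-0002", "device:aaa"),
    ("phone:416-555-0003", "device:aaa"),
    ("phone:416-555-0004", "device:bbb"),
]
G = nx.Graph(edges)

# Degree centrality as a crude risk signal: a device shared by many phone
# numbers is more suspicious than one tied to a single subscriber.
scores = nx.degree_centrality(G)
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    flag = "ANOMALOUS?" if score > 0.4 else ""
    print(f"{node:25s} {score:.2f} {flag}")
```

A production version of this idea would run over a graph database or graph ML embeddings (PyG/DGL) rather than an in-memory toy graph, with the resulting scores flowing into the risk-scoring pipelines described above.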
Why Join EnStream
- Contribute to a national‑scale initiative defining the future of digital trust in Canada
- Work on cutting‑edge fraud detection applications using real‑world identity data
- Collaborate with a highly skilled, cross‑functional team