Data Science Engineer
Listed on 2026-02-07
IT/Tech
Data Engineer, Machine Learning/ML Engineer
Job Title: Data Science Engineer
Location: Boca Raton, FL - Remote/Hybrid
About Us
At Predictive Sales AI (PSAI), we’re redefining how technology and intelligence transform digital marketing. Our AI-powered software enables home services businesses to make smarter, faster decisions, fueling growth through automation, prediction, and precision.
We are seeking a Data Science Engineer with strong data engineering and MLOps expertise to build scalable, production-grade ML and data platforms that directly impact customer growth and retention.
Job Overview
As a Data Science Engineer, you will design and operate the data and machine learning foundations behind PSAI’s predictive products. You will build scalable pipelines and robust warehouse/lakehouse models across CRM, marketing, product events, and external datasets, ensuring reliability, accuracy, and business continuity at scale.
- 4 years in data-centric engineering
- Proven experience deploying ML models via pipelines
- Deep expertise in Python, SQL, and Azure infrastructure
- Architectural ownership through data contracts and resilient modeling
- Build scalable batch and near-real-time ingestion pipelines using Azure Data Factory, APIs, event streams, and external connectors.
- Develop ML-ready datasets across CRM, marketing automation platforms, product telemetry, and geospatial data sources.
- Design performant, well-modeled warehouse/lakehouse systems in Azure Synapse or Databricks.
- Train and deploy predictive models (lead scoring, churn prediction, forecasting) through reproducible pipelines.
- Build time-aware, leakage-resistant feature pipelines for production ML use cases.
- Support full MLOps lifecycle using Azure Machine Learning, including experiment tracking, model registry, and deployment.
- Implement automated validation, anomaly detection, reconciliation, and monitoring for pipelines and warehouse models.
- Design and enforce data contracts to prevent upstream schema changes from breaking downstream ML workflows.
- Own pipeline SLAs, alerting, incident response, and durable improvements through postmortems.
- Optimize processing for very large datasets (>100GB) through partitioning, incremental loads, distributed compute, and query tuning.
- Improve cost efficiency across compute/storage in Azure environments.
- Maintain clean, testable, production-ready Python codebases using:
  - Object-oriented patterns
  - Type hinting
  - CI/CD workflows via Azure DevOps
- Package models and pipelines using Docker for consistent deployment across dev/staging/prod.
- Communicate architectural trade-offs and technical debt in business terms to Product, RevOps, and leadership.
- Partner with Engineering on instrumentation and scalable data integration.
- Mentor junior engineers through pairing, code reviews, and documentation best practices.
We are looking for an individual who is organized, proactive, and detail-oriented. In this role, you will work closely with teams across the company. Here’s what we’re looking for:
- Ownership mindset with a reliability-first approach
- Strong SQL/Python and a high attention to data quality
- Scales systems thoughtfully (performance/cost aware, maintainable designs)
- Collaborative communicator across engineering, RevOps, and analytics
- Documents well and supports others through reviews/mentorship
- Master’s degree (preferred) in Data Science, Computer Science, Statistics, Engineering, or a closely related quantitative field.
- 4 years in data engineering, ML engineering, or data platform development.
- Minimum 2 years deploying ML models into production workflows.
- Experience building pipelines and warehouse systems at scale (>100GB datasets).
- Demonstrated adaptability in fast-changing technical and business environments.
- Python (Expert): pandas, polars, scikit-learn; PyTorch, transformers; production engineering (OOP, testing, typing)
- SQL (Expert): advanced analytics, recursive CTEs, query tuning, Azure Synapse optimization
- Azure Data & ML Stack: Data Factory (ETL/ELT), Azure ML (MLOps), Key Vault, Databricks/Spark, Docker deployment
- Distributed & Large-Scale Compute: Spark, Ray, Dask; GPU acceleration with RAPIDS (plus)
- Geospatial…