Data Science Engineer Job, Boca Raton area, Florida, USA, IT/Tech

Job Title:
Data Science Engineer

Location:
Boca Raton, FL - Remote/Hybrid

About Us

At Predictive Sales AI (PSAI), we’re redefining how technology and intelligence transform digital marketing. Our AI-powered software enables home services businesses to make smarter, faster decisions, fueling growth through automation, prediction, and precision.

We are seeking a Data Science Engineer with strong data engineering and MLOps expertise to build scalable, production-grade ML and data platforms that directly impact customer growth and retention.

Job Overview

As a Data Science Engineer, you will design and operate the data and machine learning foundations behind PSAI’s predictive products. You will build scalable pipelines and robust warehouse/lakehouse models across CRM, marketing, product events, and external datasets, ensuring reliability, accuracy, and business continuity at scale.

This Role Requires

4 years in data-centric engineering
Proven experience deploying ML models via pipelines
Deep expertise in Python, SQL, and Azure infrastructure
Architectural ownership through data contracts and resilient modeling

Key Responsibilities

Build scalable batch and near-real-time ingestion pipelines using Azure Data Factory, APIs, event streams, and external connectors.
Develop ML-ready datasets across CRM, marketing automation platforms, product telemetry, and geospatial data sources.
Design performant, well-modeled warehouse/lakehouse systems in Azure Synapse or Databricks.
Train and deploy predictive models (lead scoring, churn prediction, forecasting) through reproducible pipelines.
Build time-aware, leakage-resistant feature pipelines for production ML use cases.
Support the full MLOps lifecycle using Azure Machine Learning, including experiment tracking, model registry, and deployment.
Implement automated validation, anomaly detection, reconciliation, and monitoring for pipelines and warehouse models.
Design and enforce data contracts to prevent upstream schema changes from breaking downstream ML workflows.
Own pipeline SLAs, alerting, incident response, and durable improvements through postmortems.
Optimize processing for very large datasets (>100GB) through partitioning, incremental loads, distributed compute, and query tuning.
Improve cost efficiency across compute/storage in Azure environments.
Maintain clean, testable, production-ready Python codebases using object-oriented patterns, type hinting, and CI/CD workflows via Azure DevOps.
Package models and pipelines using Docker for consistent deployment across dev/staging/prod.
Communicate architectural trade-offs and technical debt in business terms to Product, RevOps, and leadership.
Partner with Engineering on instrumentation and scalable data integration.
Mentor junior engineers through pairing, code reviews, and documentation best practices.

Desired Traits

We are looking for an individual who is organized, proactive, and detail-oriented. In this role, you will work closely with teams across the company. The traits we value most:

Ownership mindset with a reliability-first approach
Strong SQL/Python skills and close attention to data quality
Scales systems thoughtfully (performance/cost aware, maintainable designs)
Collaborative communicator across engineering, RevOps, and analytics
Documents well and supports others through reviews/mentorship

Required Skills And Experience

Master’s degree (preferred) in Data Science, Computer Science, Statistics, Engineering, or a closely related quantitative field.
4 years in data engineering, ML engineering, or data platform development.
Minimum 2 years deploying ML models into production workflows.
Experience building pipelines and warehouse systems at scale (>100GB datasets).
Demonstrated adaptability in fast-changing technical and business environments.
Python (Expert): pandas, polars, scikit-learn; PyTorch, transformers; production engineering (OOP, testing, typing)
SQL (Expert): advanced analytics, recursive CTEs, query tuning, Azure Synapse optimization
Azure Data & ML Stack: Data Factory (ETL/ELT), Azure ML (MLOps), Key Vault, Databricks/Spark, Docker deployment
Distributed & Large-Scale Compute: Spark, Ray, Dask; GPU acceleration with RAPIDS (a plus)
Geospatial…

