As a Data Scientist at Engage, you will be instrumental in measuring, improving, and scaling our AI-first real estate platform. You will own the evaluation framework for our customized Claude models, define what "accurate" means for each AI feature, and create the feedback loops that enable our ML engineers to continuously improve model performance. Beyond AI evaluation, you'll build predictive models for business outcomes like lead scoring and property valuation, support critical business analytics, and build dashboards that drive product and business decisions.
This role bridges technical rigor with practical impact—you'll directly influence both our AI product quality and our understanding of customer behavior.
Key Responsibilities:
- AI/ML Evaluation & Quality Assurance:
Design and implement evaluation frameworks to measure the accuracy, relevance, and reliability of our AI models across different use cases (property search, people search, valuation, recommendations). Build labeled test datasets and benchmark systems to track model performance over time.
- Error Analysis & Model Improvement:
Conduct systematic error analysis to identify failure patterns in AI outputs. Work closely with ML engineers to translate findings into actionable improvements: better prompts, refined data retrieval, or model tuning strategies.
- Predictive Modeling & Analytics:
Build statistical and machine learning models to support business outcomes: lead scoring, conversion prediction, property valuation models, market trend forecasting, and customer segmentation. Collaborate with ML engineers when models need to scale into production AI systems.
- Experimentation & A/B Testing:
Design and analyze experiments to compare model versions, prompt strategies, product features, and business interventions. Define success metrics and provide statistical rigor to product decisions.
- Business Analytics & Insights:
Build dashboards and reports to track key product metrics (conversion rates, user engagement, lead quality, feature adoption). Answer ad-hoc business questions using SQL, Python, and BI tools like Metabase.
- Data Quality & Monitoring:
Establish data quality standards across the platform, covering AI inputs/outputs, property data, user interactions, and business metrics. Create monitoring systems to detect model drift, data anomalies, pipeline failures, and production issues. Work with both SQL and NoSQL databases (MongoDB) to ensure data integrity and reliability across all systems.
- Cross-functional Collaboration:
Partner with ML engineers, product managers, and business stakeholders to define requirements, prioritize improvements, and communicate findings clearly to both technical and non-technical audiences.
- Research & Innovation:
Stay current with best practices in LLM evaluation, RAG system assessment, predictive modeling techniques, and real estate analytics to continuously improve our measurement and modeling capabilities.
Skills & Qualifications:
Required:
- 1-2 years of experience in data science, with hands-on experience evaluating machine learning models or AI systems in production.
- Strong proficiency in Python for data analysis, experimentation, and evaluation (Pandas, NumPy, scikit-learn).
- Advanced SQL skills for data extraction, transformation, and analysis across complex datasets.
- Experience with statistical analysis, A/B testing, and experimental design.
- Familiarity with LLM evaluation techniques—understanding how to measure accuracy, relevance, hallucination, and consistency in generative AI outputs.
- Ability to create clear data visualizations and dashboards using tools like Metabase, Tableau, or similar BI platforms.
- Strong analytical thinking and problem-solving skills—able to break down complex questions into measurable components.
- Excellent communication skills to translate technical findings into actionable recommendations for diverse stakeholders.
Preferred:
- Bachelor's or Master's degree in Computer Science, Statistics, Data Science, Mathematics, or related field.
- Experience with RAG (Retrieval-Augmented Generation) systems evaluation or prompt engineering.
- Knowledge of real estate data, property valuation metrics, or proptech…