Senior Data Scientist
Listed on 2026-02-20
Software Development
Data Scientist, Software Engineer, Machine Learning/ ML Engineer, Data Engineer
We need someone who can build high-quality forecasting models for UK energy balancing markets — not a generalist who's touched a bit of everything, but a specialist who genuinely understands time series, knows how to extract signal from massive feature sets, and can produce reliable probabilistic forecasts.
You’ll spend significant time on tasks like: engineering features from raw market data, selecting the most predictive subset from hundreds of thousands of candidates, building gradient boosting models that output well-calibrated prediction intervals, and rigorously validating everything to avoid the subtle leakage problems that plague time series work.
You won’t be responsible for deployment; we have experienced DevOps for that. But you’ll need to hand off models that are well‑documented, reproducible, and actually work in production. If you find satisfaction in the craft of building models that hold up under scrutiny, rather than just hitting a metric on a test set, this role is for you.
Feature Engineering and Selection
- Engineer predictive features from energy market data (prices, volumes, grid conditions, weather, calendar effects)
- Work with feature sets in the hundreds of thousands — you’ll need systematic approaches, not manual inspection
- Apply and evaluate feature selection methods (mRMR, importance‑based selection, recursive elimination) to build parsimonious models
- Analyse feature importance and stability across time periods and market conditions
- Understand the domain well enough to create features that reflect how the balancing market actually works
- Build gradient boosting models (XGBoost, LightGBM, CatBoost) for multi‑horizon forecasting
- Produce probabilistic forecasts (prediction intervals, quantile regression, or distribution outputs), not just point estimates
- Handle class imbalance appropriately when the problem requires classification
- Design proper time series cross‑validation schemes that respect temporal ordering
- Diagnose and fix target leakage — you should be able to explain why a 'too good' result is suspicious
- Test pipeline components using synthetic/artificial data where ground truth is known
- Validate that preprocessing steps (missing value imputation, outlier handling) don’t introduce leakage
- Build confidence that models will generalise, not just interpolate
- Track experiments systematically (MLflow or similar)
- Maintain reproducible training pipelines with proper configuration management
- Document model decisions, hyperparameter choices, and validation results clearly
- Invest time learning UK energy balancing markets — BM units, settlement periods, system prices, imbalance dynamics
- Translate domain knowledge into model improvements (better features, appropriate loss functions, sensible constraints)
- Collaborate with colleagues who understand the data infrastructure and market context
- Deep time series experience — you understand why random CV splits fail for forecasting, how to handle multiple horizons, and the pitfalls of lookahead bias
- Strong feature engineering and selection skills — you’ve worked with high‑dimensional feature sets and know multiple approaches to reduce them systematically
- Gradient boosting expertise: XGBoost, LightGBM, or CatBoost are your core tools; you understand their hyperparameters and when each matters
- Probabilistic forecasting ability: you can produce calibrated prediction intervals or quantile forecasts, not just point predictions
- Rigorous validation mindset — you’re paranoid about leakage, you test your assumptions, and you don’t trust results that seem too good
- Python fluency — clean, testable code; comfortable with pandas/Polars, scikit‑learn, and the GBM libraries
- SQL competence: you can pull and reshape data from PostgreSQL without friction
- Clear communication: you document your work and can explain model behaviour to non‑ML colleagues
- Experience with MLflow, Hydra, Metaflow, or similar tooling for experiment tracking and pipeline management
- Polars experience (we’re migrating some workloads from pandas)
- Background in energy, utilities,…