Machine Learning Scientist/Sr Scientist, Federated Benchmarking & Validation Engineering

Job in Indianapolis, Hamilton County, Indiana, 46262, USA
Listing for: Eli Lilly and Company
Full Time position
Listed on 2025-12-27
Job specializations:
  • IT/Tech
    Data Scientist, Data Engineer, Big Data, Machine Learning/ML Engineer
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below
Location: Indianapolis

At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life‑changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first.

We’re looking for people who are determined to make life better for people around the world.

Purpose

Lilly TuneLab is an AI‑powered drug discovery platform that provides biotech companies with access to machine learning models trained on Lilly's extensive proprietary pharmaceutical research data. Through federated learning, the platform enables Lilly to build models on broad, diverse datasets from across the biotech ecosystem while preserving partner data privacy and competitive advantages. This collaborative approach accelerates drug discovery by creating continuously improving AI models that benefit both Lilly and our biotech partners.

The Machine Learning Scientist/Sr Scientist, Federated Benchmarking & Validation Engineering plays an essential role within the TuneLab platform and is responsible for identifying, assessing, and implementing cutting‑edge algorithmic solutions that leverage diverse datasets while ensuring data privacy and security for our biotech partners. This position requires comprehensive knowledge of small molecule drug development, ADME/Tox, antibody engineering, and/or genetic medicine, combined with expertise in data science and statistical analysis to develop sophisticated models utilizing federated learning.

This position will be instrumental in advancing both Lilly's pipeline and our partners' drug discovery efforts by designing critical algorithms and workflows that expedite the creation of transformative therapies.

This role centers on constructing robust validation frameworks for federated models, creating privacy‑preserving test sets across partner datasets, establishing standardized benchmarks against public datasets, and ensuring model reproducibility and generalization in diverse deployment scenarios.

Key Responsibilities
  • Federated Test Set Design: Architect and implement privacy‑preserving protocols for constructing representative test sets across distributed partner datasets, ensuring statistical validity while maintaining data isolation.
  • Benchmark Suite Development: Create comprehensive benchmark suites covering small molecules (ADMET, solubility, permeability), antibodies (affinity, stability, immunogenicity), and RNA therapeutics (stability, delivery, off‑target effects).
  • Cross‑Domain Validation: Develop validation strategies that assess model generalization across different experimental protocols, cell lines, species, and therapeutic indications while respecting partner data boundaries.
  • Public Dataset Integration: Systematically benchmark federated models against public datasets (ChEMBL, PubChem, PDB, Therapeutic Antibody Database) to establish performance baselines and identify gaps.
  • Validation Frameworks: Implement time‑split or proper scaffold‑split validation protocols that assess model performance on prospective data, simulating real‑world deployment scenarios and detecting concept drift.
  • Reproducibility Infrastructure: Build robust MLOps pipelines ensuring complete reproducibility of federated experiments, including versioning of data snapshots, model checkpoints, and hyperparameter configurations.
  • Statistical Rigor: Design statistically powered validation studies accounting for multiple testing, hierarchical data structures, and non‑independent observations common in drug discovery datasets.
  • Performance Profiling: Develop comprehensive performance profiling across diverse molecular scaffolds, target classes, and property ranges, identifying systematic biases and failure modes.
  • Platform Integration: Collaborate with engineering teams to integrate validation frameworks with the TuneLab federated learning platform built on NVIDIA FLARE, ensuring scalable and automated testing across partner networks.
Basic…