Statistical Genetics Platform Engineer
Listed on 2026-01-08
IT/Tech
Data Scientist, Data Analyst
At Lilly, we unite caring with discovery to make life better for people around the world. We are a global healthcare leader headquartered in Indianapolis, Indiana. Our employees around the world work to discover and bring life-changing medicines to those who need them, improve the understanding and management of disease, and give back to our communities through philanthropy and volunteerism. We give our best effort to our work, and we put people first.
We’re looking for people who are determined to make life better for people around the world.
The Lilly research environment is evolving to centralize the access and analysis of human genetic data. This new initiative will define the data, tools and processes that provide the therapy area teams with key evidence for target evaluation and target discovery.
Many different therapy areas across Eli Lilly focus on new therapeutic approaches for the treatment of many different diseases. Starting from an idea, we work with partners across Lilly to discover and develop novel biologic, small molecule and nucleic acid-based therapeutics. Our focus is the patient: by understanding the biology and pathophysiology underlying disease states, we aim to address the root cause of disease and develop breakthrough therapies.
We have one of the strongest pipelines in the industry and a track record of delivering impactful medicines that improve people’s lives.
In this hands-on role, the Statistical Genetics Platform Engineer will join a team that enables statistical geneticists to derive scientific insights from internal and external human genetic data, with the ultimate purpose of supporting data-driven decision-making within the organization. The successful candidate will collaborate with team members as well as with data engineers and platform architects across the Lilly research environment. The goals of the collaboration include identifying genetically based disease targets, finding potential expanded clinical indications for existing assets, classifying and validating patient subpopulations, and understanding disease mechanisms.
The role will support these goals by developing robust computational pipelines that leverage harmonized clinical datasets. This role is a great opportunity to be at the forefront of scientific exploration in a dynamic research field.
Interested in working on an innovative team focusing on providing clear evidence for therapeutic targets? Apply today!
Key Responsibilities:
Design and implement robust, scalable computational pipelines for statistical genetics analyses, including workflows for GWAS, polygenic risk scores, fine-mapping, colocalization and variant annotation
Develop and maintain platform tools and APIs that enable researchers to efficiently process genomic data at scale (biobanks, population cohorts, multi-omics datasets)
Build infrastructure for reproducible research, including containerization, workflow orchestration, and version control for analytical pipelines
Optimize computational performance of statistical genetics algorithms and implement distributed computing solutions for large-scale analyses
Collaborate with statistical geneticists and computational biologists to translate methodological innovations into production-ready software
Establish best practices for data access, quality control, validation, and documentation across genomic analysis pipelines
Maintain and improve existing codebases, ensuring code quality, testing coverage, and comprehensive documentation
Monitor platform performance, solve issues, and implement improvements based on user feedback and evolving research needs
Support the integration of AI-based tools and required MLOps infrastructure
Basic Requirements:
Master’s in Computer Science, Statistical Genetics, Bioinformatics or related field and 6+ years post-Master’s experience (in industry or large-scale non-academic institutions, e.g. Broad, NIH)
OR PhD in Computer Science, Statistical Genetics, Bioinformatics or related field and 3+ years post-PhD experience (in industry or large-scale non-academic institutions, e.g. Broad, NIH)
Key Requirements:
Strong programming skills in languages commonly used in genomics…