Postdoctoral Fellow in Biostatistics & Health Data Science Job, Bloomington area, Indiana, USA, IT/Tech

Postdoctoral Fellow in Biostatistics & Health Data Science

Indiana University is an equal opportunity employer and provider of ADA services and prohibits discrimination in hiring.

Research Context & Opportunity

Modern healthcare increasingly depends on integrating data across hospitals, registries, cohorts, and public health systems. Yet semantic heterogeneity—differences in terminology, structure, and logic—remains a central barrier to reusability, interoperability, and reproducibility.

This postdoctoral position addresses a fundamental and timely research question:
How can Large Language Models (LLMs) and intelligent agents support transparent, scalable, and auditable clinical data harmonization?

We are particularly interested in:

LLM‑driven systems for aligning real‑world health data to standards such as OMOP CDM, FHIR, and UMLS
Agent‑based workflows that explain, refine, and adapt semantic mappings over time
Hybrid architectures that combine knowledge‑grounded reasoning with flexible machine learning
Tools that reduce manual burden while preserving traceability and clinical interpretability

This position offers the opportunity to publish novel methods, work with real, messy, multi‑source data, and contribute to infrastructure supporting population‑level research and health equity. The postdoctoral fellow will be based in the Department of Biostatistics and Health Data Science at Indiana University School of Medicine, in close collaboration with the Regenstrief Institute.

Responsibilities

Design and implement LLM‑based methods for clinical data harmonization, semantic normalization, and ontology alignment
Develop multi‑agent or retrieval‑augmented generation workflows for schema matching and terminology mapping
Collaborate with national and multi‑institutional initiatives in data integration and standardization
Support open‑source tooling, reproducible pipelines, and standards‑based approaches (OMOP, FHIR, UMLS)
Lead or support manuscript preparation and dissemination at top informatics and AI venues
Contribute to grant development and proposal writing

Qualifications

Required Qualifications:

Ph.D. (by start date) in Computer Science, Biomedical Informatics, Health Data Science, Biostatistics, or a closely related area.
Strong machine‑learning/deep‑learning foundation plus expertise in at least one of: multimodal learning, time‑series modeling, or NLP.
Demonstrated working experience with healthcare data (e.g., EHR, clinical text, imaging, omics).
Proficiency in Python and ML tooling (PyTorch, scikit‑learn), version control (Git), and experiment tracking (e.g., Weights & Biases).
Excellent written and oral communication skills, and ability to collaborate with multidisciplinary teams.

Preferred Qualifications:

Experience with concept normalization, ontology mapping, or schema alignment
Familiarity with LLM agents, tool‑augmented reasoning, or hybrid rules+LLM systems
Record of publications in relevant domains (informatics, machine learning, AI, knowledge representation)
Experience with multi‑site data harmonization or federated data environments

Benefits

A collaborative environment at the intersection of real‑world data, applied AI, and translational science
Opportunities to work across academic, clinical, and public health settings
Mentorship and support toward independent research or career development in academia or industry
Competitive salary and benefits through Indiana University
A culture that values both scientific innovation and practical impact
