Technical Professional - Computational Biologist/Bioinformatics Scientist; Temporary
Listed on 2026-02-18
The Biosciences Division at Oak Ridge National Laboratory seeks a Technical Professional to support computational biology research within the Plant-Microbe Interfaces (PMI) Science Focus Area and the GPTgp (Generative Pretrained Transformer for Genomic Photosynthesis) project. This position focuses on developing machine learning pipelines, AI-driven scientific workflows, and data infrastructure to accelerate discovery in Populus genomics and the characterization of Populus-associated microbial communities.
The successful candidate will design and implement scalable ML frameworks for predicting microbial phenotypes and interactions from genomic data, build AI agent systems that enable researchers to interact with complex datasets through natural language, and establish robust data infrastructure supporting foundation model development. This role emphasizes the creation of reusable computational
Major Duties/Responsibilities:
- Design, develop, and validate machine learning pipelines for predicting microbial phenotypes (e.g., carbon utilization, growth characteristics) and microbial interactions from genomic features, with emphasis on generalizable frameworks applicable across bacterial collections
- Architect and implement AI agent workflows using large language models, including tool‑calling patterns, multi‑agent orchestration, and retrieval‑augmented generation (RAG) systems for scientific applications
- Build programmatic APIs and natural language interfaces that enable researchers to query, retrieve, and analyze Populus genomic data and associated microbial genome collections
- Design and implement data lakehouse architecture, including schema design, data modeling, and ETL pipeline development for multi‑modal genomic and phenotypic datasets
- Develop reproducible computational workflows for genome annotation and omics data processing of Populus and Populus-associated microbes, optimized for high‑performance computing environments
- Analyze biological sequencing data including 16S amplicon, metagenomics, and RNA‑seq datasets to support plant‑microbe research objectives
- Contribute to peer‑reviewed publications and technical reports; present research at scientific conferences
- Collaborate with experimental biologists and domain scientists to translate research questions into computational solutions
- Deliver ORNL’s mission by aligning behaviors, priorities, and interactions with our core values of Impact, Integrity, Teamwork, Safety, and Service. Promote equal opportunity by fostering a respectful workplace – in how we treat one another, work together, and measure success.
Basic Qualifications:
- Ph.D. in Bioinformatics, Computational Biology, Computer Science, or a related quantitative field, with 2+ years of relevant postdoctoral or professional experience
- Proficiency in Python and experience with scientific computing and ML libraries (e.g., NumPy, pandas, scikit‑learn, PyTorch)
- Demonstrated experience building end‑to‑end machine learning pipelines, from feature engineering through model evaluation and deployment
- Experience analyzing high‑throughput sequencing data (16S, metagenomics, or transcriptomics)
- Familiarity with workflow orchestration tools and containerization for reproducible analysis pipelines
- Strong written and oral communication skills, with a track record of scientific publications or technical documentation
- Ability to work both independently and collaboratively in a research environment
Preferred Qualifications:
- Experience developing LLM‑powered applications using frameworks such as LangChain, including agent design, tool integration, and prompt engineering
- Experience with data lakehouse technologies, particularly Delta Lake, for managing large‑scale scientific datasets
- Background in genotype‑to‑phenotype prediction or microbial trait modeling
- Background in REST API development and database management (PostgreSQL, MySQL)
- Proficiency with HPC environments and job schedulers (e.g., SLURM)
- Experience with deep learning frameworks and neural network architectures
- Familiarity with plant genomics, microbial genomics, or…