Machine Learning Staff Scientist at NSF-NCEMS
Listed on 2026-06-02
IT/Tech
Data Scientist, Machine Learning/ ML Engineer, AI Engineer (Applied/Software)
About the Position
The Machine Learning Staff Scientist (Research Data Scientist) is dedicated to advancing the collaborative research of the Center's Working Groups within NCEMS and ICDS at Penn State.
Work Arrangement
This position may involve a hybrid of remote and on‑site work with a minimum of three days per week on the Penn State University Park campus. Fully remote work is not permitted.
Responsibilities
- Collaborate with NCEMS Working Groups to design, develop, and evaluate machine learning approaches for integrating, analyzing, and visualizing molecular and cellular biology data.
- Prepare ML‑ready datasets by leading data wrangling, harmonization, standardization, quality control, and documentation for robust training and reuse across biological modalities.
- Develop end‑to‑end ML workflows (feature learning, training, validation, benchmarking, and uncertainty quantification) for multi‑omics and related data types.
- Build and optimize predictive and generative models (deep learning, probabilistic models, foundation‑model adaptation, graph/neural sequence models) to support synthesis research questions.
- Implement scalable training and inference pipelines using modern ML tooling (PyTorch, TensorFlow, JAX), version control, containers, and HPC/GPU resources.
- Support the publication of intermediate data products, models, code, and documentation.
- Stay up‑to‑date with the latest advancements in machine learning, AI for biology, and the evolving landscape of public molecular and cellular datasets.
Qualifications
- MS or PhD in Machine Learning, Computational Biology, Bioinformatics, Computer Science, Statistics, Data Science, or a related field (preferred).
- Strong proficiency in Python for scientific computing and machine learning, with experience in ML libraries/frameworks (PyTorch, TensorFlow, JAX, scikit‑learn).
- Demonstrated knowledge of core ML, deep learning, and statistical methods: regression, classification, clustering, dimensionality reduction, sequence and time‑series modeling, CNNs, RNNs, GNNs, transformers, generative modeling (diffusion, variational/auto‑regressive), self‑/weakly‑supervised learning, NLP, computer vision, and causal inference.
- Experience working with high‑dimensional, large‑scale molecular and cellular datasets (genomic, transcriptomic, epigenomic, proteomic, metabolomic, imaging‑derived, single‑cell, or multi‑omics) and appropriate preprocessing and normalization for ML.
- Solid understanding of molecular and cellular biology concepts to frame ML problems across the central dogma and collaborate with domain scientists.
- Experience with software engineering practices for research‑grade code, version control (Git), reproducible environments (containers, conda), and HPC/GPU computing.
- Publications in peer‑reviewed journals demonstrating contributions to the field.
- Experience supporting or contributing to multi‑PI projects.
Candidates must demonstrate a commitment to ethical conduct, research integrity, a strong work ethic, interpersonal and written communication skills, and the ability to work well in a team environment.
Benefits
Penn State provides a competitive benefits package for full‑time employees, including medical, dental, and vision coverage, robust retirement plans, paid time off, and a generous 75% tuition discount for employees and eligible family members.
Legal and EEO Statement
Employment with the University will require successful completion of background checks in accordance with University policies. Penn State is an equal‑opportunity employer and is committed to providing employment opportunities without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability, or protected veteran status.