Founding Machine Learning Scientist - Molecular AI
Listed on 2026-02-17
Research/Development
Artificial Intelligence, Data Scientist
We are building computational systems to discover and develop small molecule medicines from fungi. Nearly half of all oral medicines originate from natural molecules, yet discovery from nature has historically been slow. Advances in mass spectrometry and computation now make it possible to systematically explore nature’s chemical diversity at scale.
We recently introduced Gaia-01, a 1B-parameter foundation model for molecular structure prediction from mass spectrometry that outperforms current state-of-the-art systems on the MassSpecGym benchmark. We are now developing the next generation of this model.
We are looking for a founding machine learning scientist to design and advance models that infer molecular structure and properties directly from mass spectrometry data.
You will take ownership of the next iteration of our molecular foundation model (Gaia-02), extending spectrum-to-structure prediction into broader molecular reasoning and downstream applications. This role sits at the intersection of machine learning, chemistry, and metabolomics, and involves close collaboration with computational biology and experimental teams.
This is a hands‑on, fast‑paced role in an early‑stage company with significant autonomy and technical responsibility.
What you’ll own
- Lead the development of the next generation of our molecular foundation model for mass spectrometry
- Design and train models that infer molecular structure from mass spectra
- Develop latent molecular representations from MS/MS and related data
- Extend structure predictions into downstream molecular reasoning (e.g., bioactivity, prioritization)
What you’ll bring
- Experience developing and training machine learning models in PyTorch or similar frameworks
- Experience designing novel modeling approaches and implementing the latest methods from the literature
- Ability to independently scope and execute research problems involving large, high‑dimensional datasets, including handling noise and distributional shifts
- Experience training models at scale (cloud or HPC environments)
- Strong software engineering skills in Python, including writing clean, well‑structured, production‑quality code
- Experience with generative models (e.g., autoregressive, diffusion, flow) and geometric deep learning (e.g., GNNs, Deep Sets, EGNNs)
- Experience working with molecular, chemical, or spectral datasets
- Familiarity with metabolomics or mass spectrometry workflows and computational models (e.g., MIST, DreaMS, ICEBERG)
We value agency, technical depth, and learning velocity more than years of experience.
If you find this exciting and think you’d be a great fit, we’d love to hear from you. We can go from first conversation to offer decision in days.