ML Research Scientist I/II, Multimodal Data Extraction
Listed on 2025-11-20
Your Impact at Lila
As an ML Research Scientist - Multimodal Data Extraction, you will advance Lila’s vision of scientific superintelligence by developing foundation models that autonomously read, interpret, and structure scientific knowledge across text, images, and experimental data in the physical sciences. Your research will help unify the world’s scientific information into machine‑understandable form, powering reasoning, prediction, and autonomous discovery across materials science and chemistry.
What You’ll Be Building
- Research and develop AI systems that extract and structure knowledge from diverse scientific sources.
- Design and fine‑tune large language, multimodal, and specialized models for factual, interpretable data extraction.
- Build scalable pipelines for unstructured and heterogeneous scientific data, integrating text, tables, and visuals.
- Collaborate with domain experts to align extracted data with real‑world discovery workflows.
- Publish research that advances the state of the art in multimodal understanding and AI‑driven knowledge extraction.
Qualifications
- PhD (or equivalent research experience) in Computer Science, Chemistry, Materials Science, or a related field.
- Expertise in machine learning, NLP, and vision–language modeling using PyTorch and Hugging Face Transformers.
- Proven ability to train, fine‑tune, and evaluate LLMs and multimodal models for scientific data extraction.
- Strong understanding of data structures and representations used in the physical sciences.
- Demonstrated research impact through publications, preprints, or open‑source work (e.g., NeurIPS, ICLR, ICML, ACL, EMNLP, scientific journals).
- Experience with multimodal fusion architectures and document‑level understanding.
- Knowledge of scientific document parsing (OCR, table extraction, figure‑caption linking).
- Familiarity with knowledge‑graph construction or reasoning systems for science.
- Experience with noisy or heterogeneous real‑world scientific data.
- Collaborative mindset and passion for advancing AI in the physical sciences.
Lila Sciences is the world’s first scientific superintelligence platform and autonomous lab for life, chemistry, and materials science. We are pioneering a new age of boundless discovery by building the capabilities to apply AI to every aspect of the scientific method. We introduce scientific superintelligence to solve humankind’s greatest challenges, enabling scientists to bring forth solutions in human health, climate, and sustainability at a pace and scale never experienced before.
Learn more about this mission a.ai.
We expect the base salary for this role to fall between $176,000 and $304,000 USD per year, along with bonus potential and generous early equity. The final offer will reflect your unique background, expertise, and impact.
Lila Sciences is committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or veteran status.
Agency Policy
Lila Sciences does not accept unsolicited resumes from any source other than candidates. The submission of unsolicited resumes by recruitment or staffing agencies to Lila Sciences or its employees is strictly prohibited unless contacted directly by Lila Sciences’ internal Talent Acquisition team.