Data Architect, Data Foundry
Job in Indianapolis, Marion County, Indiana, 46202, USA
Listed on 2026-04-25
Listing for: Lilly
Full Time position
Job specializations:
- IT/Tech: Data Engineer, Data Scientist, Data Analyst, AI Software Engineer
Job Description & How to Apply Below
We give our best effort to our work, and we put people first. We're looking for people who are determined to make life better for people around the world.
**Position:** Data Architect, Data Foundry
**Location:** San Diego, CA; San Francisco, CA; Boston, MA; Louisville, CO; Indianapolis, IN
**Overview**
**Lilly Small Molecule Discovery** is purpose-built to create molecules that make life better for people.
**Discovery Technology and Platforms (DTP)** accelerates molecule discovery by building optimized foundational platforms, streamlining lab operations through advanced technologies and data connectivity, and investing in novel capabilities.
**Data Foundry** is a multidisciplinary team within DTP that enables AI-native drug discovery through four integrated pillars: **Architecture4Insight** (data infrastructure and scientific software), **Methods4Insight** (analytical and computational methods), **Automation & Scale4Insight** (lab automation and agentic workflows), and **Preparedness4Insight** (data governance and readiness). These pillars empower every Lilly scientist to make optimal decisions by providing seamless access to data, insights, and AI-driven capabilities, serving both human scientists and autonomous AI agents.
**Position Summary**
We are seeking **Data Architects** at multiple levels to design and build the data infrastructure that makes AI-native drug discovery possible. You will create the schemas, ontologies, data models, knowledge graphs, and platform architectures that transform raw scientific data into machine-actionable, FAIR-compliant, insight-ready assets, serving both discovery scientists and autonomous AI agents.
This role is the foundation of **Architecture4Insight**. Everything the software engineering team builds (pipelines, APIs, prototypes) depends on the data models and platform architecture this team designs. You will work with deep knowledge of scientific data (chemical, biological, HTE, automation-generated) to create custom-fit solutions, then partner with **Tech@Lilly** to scale and maintain them. The role spans three focus areas depending on expertise: **data modeling & ontologies**, **data platform & lakehouse architecture**, and **knowledge graph & specialized data systems**.
**Responsibilities**
**Data Modeling & Ontologies**
+ Design and implement data models, schemas, and ontologies for chemical, biological, and automation-generated data that serve discovery workflows across the portfolio.
+ Define and maintain controlled vocabularies, metadata standards, and FAIR-compliant data frameworks in partnership with Preparedness4Insight.
+ Implement semantic data standards (RDF, OWL, SPARQL) and ontology engineering practices to create interoperable, machine-readable scientific data.
**Data Platform & Lakehouse Architecture**
+ Design and implement data lakehouse architecture using modern platforms (Databricks, Snowflake, or equivalent), including data storage patterns, partitioning strategies, and query optimization.
+ Build and optimize ETL/ELT pipelines using Spark, dbt, or similar tools to transform raw scientific data into analytical and ML-ready formats.
+ Implement real-time and streaming data integration (Kafka, Kinesis, event-driven patterns) connecting LIMS, instruments, and lab automation systems to the data infrastructure.
**Knowledge Graph & Specialized Data Systems**
+ Design and implement knowledge graphs (Neo4j, Amazon Neptune, TigerGraph) that capture molecular, target, pathway, and experimental relationships across the discovery landscape.
+ Architect specialized data solutions: array databases (TileDB) for genomics/imaging, document stores (MongoDB) for experimental records, and vector databases for embedding-based retrieval supporting ML and RAG workflows.
+ Build query and traversal patterns that enable scientists and AI agents to ask relational questions across the entire data landscape.
**Cross-Functional Partnership**
+ Partner with scientific software engineers to ensure data architectures are implementable, performant, and well-documented.
+ Collaborate with Methods4Insight to design data structures that support analytical model training, deployment, and evaluation.
+ Work with Tech@Lilly to define scaling strategies, ensure enterprise compliance, and transition data architectures to production-grade management.
+ Contribute to build-versus-buy-versus-adopt decisions by evaluating commercial and open-source data platforms against Data Foundry requirements.
**Basic Requirements**
+ B.S. or M.S. in Computer Science, Data Science,…