Sr. Data Engineer
Job in Madison, Dane County, Wisconsin, 53774, USA
Listed on 2026-06-06
Listing for: Trans Ova
Full Time position
Job specializations:
- IT/Tech
Data Engineer, Data Analyst, Data Science Manager, Data Scientist
Job Description & How to Apply Below
This role is responsible for the design, development, and maintenance of data integration, analytics, and reporting solutions that support our animal genetics and bioinformatics workloads. The ideal candidate will possess expertise in Databricks and modern data engineering tools such as Azure Data Factory, combined with hands‑on experience working with biological, genomic, or other omics datasets. This position requires a proactive, self‑motivated, and results‑oriented individual with a passion for data, a strong understanding of data architecture and warehousing principles, and an appreciation for bioinformatics workflows in a commercial genetics environment.
Responsibilities:
- Design, develop, and maintain robust and efficient ETL/ELT pipelines and processes on Databricks for both operational and bioinformatics datasets (e.g., genomic markers, phenotypic data, laboratory outputs).
- Ingest, transform, and harmonize structured and semi‑structured biological data from lab systems, LIMS, sequencing platforms, and external partners into the enterprise data platform.
- Troubleshoot and resolve Databricks pipeline errors and performance issues.
- Optimize data flow performance and minimize data latency across scientific and business use cases.
- Implement data quality checks, validations, and reconciliation processes within ETL workflows, including domain‑specific checks for genomic and phenotypic data.
- Develop and maintain Databricks pipelines, notebooks, and datasets using Python, Spark, and SQL.
- Optimize Databricks jobs for performance and cost‑effectiveness, including large‑scale bioinformatics and analytics workloads.
- Integrate Databricks with other data sources and systems, including lab instruments, genomic databases, and on‑prem or cloud data stores.
- Participate in the design and implementation of data lake architectures that support both traditional analytics and bioinformatics pipelines.
- Participate in the design and implementation of data warehousing solutions to support reporting, analytics, and scientific modeling.
- Model and curate subject areas for genetics, reproduction, and bioinformatics (e.g., animals, pedigrees, genotypes, breeding values, trials).
- Support data quality initiatives and implement data cleansing procedures across business and scientific domains.
- Collaborate with business users, scientists, geneticists, and bioinformaticians to understand data requirements for department‑driven reporting and analytics needs.
- Maintain and extend the existing library of complex dashboards and visualizations to surface genetic, reproductive, and operational insights.
- Enable self‑service analytics for R&D and product teams by exposing well‑governed, documented data products.
- Troubleshoot and resolve report issues, including performance bottlenecks and data inconsistencies.
- Apply strong programming skills in Python, SQL, and Spark to build scalable data and bioinformatics workflows.
- Use CI/CD and IaC tools (Terraform, ARM, CloudFormation) to automate deployment of data platform components and analytics environments.
- Design and implement Databricks platform architecture on Azure and AWS infrastructure, including environments that support large‑scale scientific computation.
- Contribute to cloud security, governance, and cost optimization practices for data and bioinformatics workloads.
- Partner with geneticists, biostatisticians, and bioinformaticians to translate scientific requirements into scalable data and platform architectures.
- Support or orchestrate bioinformatics pipelines (e.g., variant processing, quality control, annotation, genotype imputation, genomic evaluation) using cloud and Databricks capabilities.
- Ensure that data models, pipelines, and storage structures meet the needs of downstream analytics, predictive models, and genetic evaluations.
- Advocate for best practices in managing sensitive biological and genetic data, including data governance, access control, and compliance with relevant standards and regulations.
- Thrive in an entrepreneurial,…