Data Science Engineer
Listed on 2025-12-26
IT/Tech
Data Scientist, Machine Learning/ ML Engineer, Data Analyst, AI Engineer
Lawrence Berkeley National Laboratory is hiring a Data Science Engineer within the Scientific Data division. The Computational Biosciences Group has an immediate opening for a software and data engineer (CSE2) in the area of multi-modal data modeling and analysis with applications to bioscience research. You will develop new methods and software tools that enable scientific knowledge discovery using modern data management and machine learning technologies and advance the state-of-the-art in data-intensive analysis.
Your projects will focus on the domains of omics/structural biology data and neurophysiology data. Under limited instruction, you will be part of an experienced team conducting R&D in the areas of FAIR data science, AI, and modern methods for data understanding. You will be working as part of a multi-disciplinary team composed of computer scientists, data scientists, and bioinformaticians.
You will:
- Design and develop user-friendly software packages for scientific data analysis and management
- Develop machine learning and AI solutions for analysis of biological data in close collaboration with diverse teams of scientists
- Work with domain experts to develop FAIR data models and management solutions for bioscience applications
- Work closely with the community of developers of the Neurodata Without Borders and LinkML open source data ecosystems, as well as the Joint Genome Institute
- Maintain and manage open source software products, including managing development priorities, software releases, continuous integration, and testing
- Design, implement and maintain software tools for creating and running parallel data intensive analysis workflows
- Design, implement and maintain high performance computing and cloud solutions for visualization and analysis of complex biological data
- Train scientists and research software engineers in the use of the developed software products at workshops and conferences
- Work on and resolve problems of diverse scope where analysis of data requires evaluation of identifiable factors.
- Demonstrate good judgment in selecting methods and techniques for obtaining solutions.
- Network with senior internal and external personnel in their own area of expertise.
We are looking for:
- Typically requires a minimum of 5 years of related experience with a Bachelor’s degree in computer science, data science, machine learning, bioinformatics, or equivalent; or 3 years and a Master’s degree; or equivalent work experience designing and developing software for data modeling or analysis.
- Experience developing complex software solutions
- Experience contributing to community-driven open source software
- Demonstrated experience in one or more of the following areas: machine learning, data management, scientific data analysis
- Works well in a collaborative team environment
- Demonstrated capability with the Git version control and continuous integration systems, such as GitHub or GitLab
- Ability to troubleshoot and solve problems of diverse scope where analysis of data requires evaluation of identifiable factors.
- Ability to network with senior internal and external personnel in their own area of expertise.
- Excellent oral and written communication skills.
- Demonstrated ability to work effectively as part of a cross-disciplinary team.
Desired skills/knowledge:
- Master’s or PhD in Computer Science or related field, with 5 or more years of professional experience designing and developing scientific data modeling or analysis software
- Experience working with modern scientific data formats and database systems, such as HDF5, Zarr, MongoDB, PostgreSQL, MySQL, and Redis
- Experience with Neurodata Without Borders, LinkML, or similar software ecosystems
- Experience working with large biological data, such as in the areas of neurophysiology, microbiology, genomics, or protein design
- Experience working with modern parallel compute technologies, such as Cloud, High-Performance Computing, containerization, and parallel…