Sr. Data Scientist
Listed on 2025-11-06
IT/Tech
Data Scientist, Machine Learning/ ML Engineer, Data Analyst, Data Engineer
At Veracyte, we offer exciting career opportunities for those interested in joining a pioneering team that is committed to transforming cancer care for patients across the globe. Working at Veracyte enables our employees not only to make a meaningful impact on the lives of patients, but also to learn and grow within a purpose-driven environment. This is what we call the Veracyte way: it's about how we work together, guided by our values, to give clinicians the insights they need to help patients make life-changing decisions.
Our Values:
- We Seek A Better Way: We innovate boldly, learn from our setbacks, and are resilient in our pursuit to transform cancer care
- We Make It Happen: We act with urgency, commit to quality, and bring fun to our hard work
- We Are Stronger Together: We collaborate openly, seek to understand, and celebrate our wins
- We Care Deeply: We embrace our differences, do the right thing, and encourage each other
We are looking for an experienced data scientist to empower our medical diagnostic capabilities. As an integral part of our R&D team, you will research and develop data-centric analytic methods. The successful candidate will be in charge of researching and developing algorithmic pipelines across various system levels, from raw genomic data processing, through feature extraction, up to machine learning, statistical assessment and optimization.
The successful candidate will be an essential team player in the significant optimization of our growing genomic data analytics platform. This entails working cross-functionally with bioinformaticians, clinical analysts, software engineers and business analysts to ensure Veracyte's data becomes a powerful asset and a driver of value growth. Specifically, the role involves participating in the design and prototyping of cutting-edge deep learning, machine learning and statistical algorithms for the analysis of genetic data to improve clinical performance; reworking high-throughput data processing pipelines to replace heavy, repetitive computations with trained models; and applying DL/ML at the system level to reduce processing times and costs.
The ideal candidate will be able to see the big picture, understand user needs and requirements, and identify opportunities to apply solutions from the deep learning, machine learning, algorithms and data science domains.
Responsibilities:
- Conduct advanced analyses of next-generation sequencing data, from individual research experiments to very large-scale datasets.
- Research, design, and implement novel deep learning architectures, statistical algorithms, and mathematical models to advance applications in genomics.
- Contribute to the design and development of data specification standards and scalable data science processes, ensuring consistency and reproducibility.
- Derive robust and correct conclusions grounded in rigorous mathematical analysis and principles of probability and statistics.
- Collaborate closely with R&D, production, and clinical teams to develop and optimize pipelines for high-throughput sequencing data.
- Act as a subject matter expert for data- and algorithm-related issues, providing guidance and troubleshooting support.
- Ensure internal processes, metrics, and workflows align to deliver the highest product quality and scientific rigor.
- Develop and maintain Python-based software infrastructure to support algorithm testing, benchmarking, and simulation studies.
- Prepare clear, concise, and accurate technical documentation, including research reports, algorithm specifications, and data standards.
- Partner across the organization to design and implement end-to-end data application pipelines for large-scale sequencing workflows.
- Proactively identify opportunities to improve existing technology and contribute ideas that drive innovation and future product development.
Qualifications:
- Master's degree or PhD in engineering, computer science, bioinformatics, statistics, or a similar field, with a strong foundation in probability theory and/or statistics.
- A minimum of 6 years of practical experience with scientific methods, processes, algorithms and systems for extracting knowledge and insights from data
- Extensive knowledge in the domains of machine learning, signal processing and related quantitative techniques and theory
- Strong foundation in probability theory and/or statistics including parameter estimation and hypothesis testing
- Knowledge of core concepts in data analysis and applied statistics - model testing, regression, classification, clustering, data mining
- Good Python, Bash and scripting fundamentals, including file parsing and data visualization
- Experience with cloud computing technologies (AWS preferred) - a plus
- Proficiency in at least one major deep learning framework (PyTorch, TensorFlow, MXNet) - a plus
- Knowledge of bioinformatics tools, clinical diagnostics workflows or genetic processes - a plus
- Experience with data management in a regulated environment - a plus
- Desire to learn about human genetics,…