Sr. Data Scientist Job, Sacramento area, California, USA, IT/Tech

At Veracyte, we offer exciting career opportunities for those interested in joining a pioneering team that is committed to transforming cancer care for patients across the globe. Working at Veracyte enables our employees not only to make a meaningful impact on the lives of patients, but also to learn and grow within a purpose-driven environment. This is what we call the Veracyte way: it's about how we work together, guided by our values, to give clinicians the insights they need to help patients make life-changing decisions.

Our Values:

We Seek A Better Way: We innovate boldly, learn from our setbacks, and are resilient in our pursuit to transform cancer care
We Make It Happen: We act with urgency, commit to quality, and bring fun to our hard work
We Are Stronger Together: We collaborate openly, seek to understand, and celebrate our wins
We Care Deeply: We embrace our differences, do the right thing, and encourage each other

The Position:

We are looking for an experienced data scientist to empower our medical diagnostic capabilities. As an integral part of our R&D, you will be called on to research and develop data-centric analytic methods. The successful candidate will be in charge of researching and developing algorithmic pipelines across various system levels, from raw genomic data processing, through feature extraction, up to machine learning, statistical assessment and optimization.

The successful candidate will be an essential team player in the significant optimization of our growing genomic data analytics platform. This entails working cross-functionally with bioinformaticians, clinical analysts, software engineers and business analysts to ensure Veracyte's data becomes a powerful asset and a driver of value growth. Specifically, the role involves participating in the design and prototyping of cutting-edge deep learning, machine learning and statistical algorithms for the analysis of genetic data to improve clinical performance; reworking the high-throughput data processing pipelines to replace heavy, repetitive computations with trained models; and applying DL/ML at the system level to reduce processing times and costs.

The ideal candidate will be able to see the big picture, understand user needs and requirements, and identify opportunities to apply solutions from the deep learning, machine learning, algorithms and data science domains.

Responsibilities

Conduct advanced analyses of next-generation sequencing data, from individual research experiments to very large-scale datasets.
Research, design, and implement novel deep learning architectures, statistical algorithms, and mathematical models to advance applications in genomics.
Contribute to the design and development of data specification standards and scalable data science processes, ensuring consistency and reproducibility.
Derive robust and correct conclusions grounded in rigorous mathematical analysis and principles of probability and statistics.
Collaborate closely with R&D, production, and clinical teams to develop and optimize pipelines for high-throughput sequencing data.
Act as a subject matter expert for data- and algorithm-related issues, providing guidance and troubleshooting support.
Ensure internal processes, metrics, and workflows align to deliver the highest product quality and scientific rigor.
Develop and maintain Python-based software infrastructure to support algorithm testing, benchmarking, and simulation studies.
Prepare clear, concise, and accurate technical documentation, including research reports, algorithm specifications, and data standards.
Partner across the organization to design and implement end-to-end data application pipelines for large-scale sequencing workflows.
Proactively identify opportunities to improve existing technology and contribute ideas that drive innovation and future product development.

Who You Are

Master's degree or PhD in engineering, computer science, bioinformatics, statistics, or a similar field, with a strong foundation in probability theory and/or statistics.
A minimum of 6 years of practical experience in scientific methods, processes, algorithms and systems for knowledge and insights extraction from data
Extensive knowledge in the domains of machine learning, signal processing and related quantitative techniques and theory
Strong foundation in probability theory and/or statistics including parameter estimation and hypothesis testing
Knowledge of core concepts in data analysis and applied statistics - model testing, regression, classification, clustering, data mining
Good Python, Bash and scripting fundamentals, including file parsing and data visualization
Experience with cloud computing technologies (AWS preferred) - a plus
Proficiency in at least one major deep learning framework (PyTorch, TensorFlow, MXNet) - a plus
Knowledge of bioinformatics tools, clinical diagnostics workflows or genetic processes - a plus
Experience with data management in a regulated environment - a plus
Desire to learn about human genetics,…

