Data Analyst - Surgical Oncology Research
Listed on 2026-05-26
Data Scientist – Clinical Research Informatics
The Data Scientist will join the University of Texas MD Anderson Cancer Center within the Clinical Research Informatics Program (CRIP), a specialized informatics group embedded in a surgical oncology division. The Data Scientist will support active, funded cancer research by developing analytic datasets, machine learning workflows, and large language model-based pipelines that enable high‑quality, reproducible oncology research. The Clinical Research Informatics Data Scientist will work closely with clinicians, informaticians, and cancer researchers to translate complex clinical questions into robust data and AI solutions.
The University of Texas MD Anderson Cancer Center is a leading institution focused on cancer care, research, education, and prevention. As part of UT MD Anderson, the Clinical Research Informatics Data Scientist contributes to a mission‑driven environment where advanced analytics and clinical insight come together to improve outcomes for patients with cancer. The role is designed for individuals motivated by translational impact, interdisciplinary collaboration, and methodological growth.
Minimum $27.64 - Midpoint $34.62 - Maximum $41.59.
Typical work schedule: Monday‑Friday (minimum 40 hours/8‑hour days). This is a hybrid role with a minimum of one day on‑site in Houston, TX, and additional on‑site presence as required for business or departmental needs.
This role offers the opportunity to work within a first‑of‑its‑kind clinical research informatics program embedded directly in surgical oncology at UT MD Anderson, supporting active, high‑impact cancer research using advanced analytics, machine learning, and LLM‑based abstraction pipelines. The team is lean and highly collaborative, providing visible individual contributions, mentorship‑oriented management, and genuine opportunities to co‑author research while maintaining a sustainable work‑life balance in a hybrid environment.
- Employer‑paid medical coverage starting day one for employees working 30+ hours/week, plus optional group dental, vision, life, AD&D, and disability insurance.
- Accruals for PTO and Extended Illness Bank, plus paid holidays, wellness, childcare, and other leave options.
- Tuition Assistance Program after six months of service and access to extensive wellness, fitness, and employee resource groups.
- Defined‑benefit pension through the Teachers Retirement System, voluntary retirement plans, and employer‑paid life and reduced salary protection programs.
- Maintain standardized analytic datasets for cancer research across multiple data sources.
- Develop and apply common data models, variable definitions, and ontologies.
- Build data transformation pipelines using Python, R, and SQL.
- Maintain metadata, data dictionaries, and analytic documentation.
- Ensure data quality, completeness, and internal consistency across studies.
- Provide ongoing support for database‑related queries.
- Extract and compile data from multiple clinical and research systems.
- Merge datasets from disparate sources.
- Format and standardize data for analysis and reporting.
- Prepare AI‑ready feature sets and longitudinal datasets for predictive modeling in oncology.
- Implement data preprocessing, feature engineering, and validation workflows for ML models.
- Design and implement LLM pipelines to extract cancer‑specific variables from unstructured clinical text.
- Develop, test, and maintain multi‑prompt or multi‑stage LLM workflows.
- Evaluate LLM outputs using gold‑standard annotations and quantitative metrics.
- Support model validation, error analysis, and generalizability testing across cohorts.
- Contribute to reusable analytic and modeling frameworks across disease sites.
- Collaborate with clinicians, informaticians, programmers, and investigators on analytic workflows.
- Participate in interdisciplinary research teams across oncology and informatics.
- Support manuscript‑ and grant‑related analyses with reproducible…