
Data Analyst - Surgical Oncology Research

Job in Houston, Harris County, Texas, 77246, USA
Listing for: University of Texas MD Anderson Cancer Center
Full Time position
Listed on 2026-05-26
Job specializations:
  • IT/Tech
    Data Scientist, Data Analyst, Machine Learning/ML Engineer
Salary/Wage Range or Industry Benchmark: 60,000 - 80,000 USD yearly
Job Description & How to Apply Below

Data Scientist – Clinical Research Informatics

The Data Scientist will join the University of Texas MD Anderson Cancer Center within the Clinical Research Informatics Program (CRIP), a specialized informatics group embedded in a surgical oncology division. The Data Scientist will support active, funded cancer research by developing analytic datasets, machine learning workflows, and large language model-based pipelines that enable high‑quality, reproducible oncology research. The Clinical Research Informatics Data Scientist will work closely with clinicians, informaticians, and cancer researchers to translate complex clinical questions into robust data and AI solutions.

The University of Texas MD Anderson Cancer Center is a leading institution focused on cancer care, research, education, and prevention. As part of UT MD Anderson, the Clinical Research Informatics Data Scientist contributes to a mission‑driven environment where advanced analytics and clinical insight come together to improve outcomes for patients with cancer. The role is designed for individuals motivated by translational impact, interdisciplinary collaboration, and methodological growth.

Pay range: Minimum $27.64 - Midpoint $34.62 - Maximum $41.59.

Typical work schedule: Monday‑Friday (minimum 40 hours/8‑hour days). This is a hybrid role with a minimum of one day on‑site in Houston, TX, and additional on‑site presence as required for business or departmental needs.

Why Us?

This role offers the opportunity to work within a first‑of‑its‑kind clinical research informatics program embedded directly in surgical oncology at UT MD Anderson, supporting active, high‑impact cancer research using advanced analytics, machine learning, and LLM‑based abstraction pipelines. The team is lean and highly collaborative, providing visible individual contributions, mentorship‑oriented management, and genuine opportunities to co‑author research while maintaining a sustainable work‑life balance in a hybrid environment.

  • Employer‑paid medical coverage starting day one for employees working 30+ hours/week, plus optional group dental, vision, life, AD&D, and disability insurance.
  • Accruals for PTO and Extended Illness Bank, plus paid holidays, wellness, childcare, and other leave options.
  • Tuition Assistance Program after six months of service and access to extensive wellness, fitness, and employee resource groups.
  • Defined‑benefit pension through the Teachers Retirement System, voluntary retirement plans, and employer‑paid life and reduced salary protection programs.

KEY RESPONSIBILITIES

Data Standardization, Harmonization, and Infrastructure
  • Maintain standardized analytic datasets for cancer research across multiple data sources.
  • Develop and apply common data models, variable definitions, and ontologies.
  • Build data transformation pipelines using Python, R, and SQL.
  • Maintain metadata, data dictionaries, and analytic documentation.
  • Ensure data quality, completeness, and internal consistency across studies.
  • Provide ongoing support for database‑related queries.
Data Extraction and Integration
  • Extract and compile data from multiple clinical and research systems.
  • Merge datasets from disparate sources.
  • Format and standardize data for analysis and reporting.
Machine Learning and LLM Pipeline Development for Research
  • Prepare AI‑ready feature sets and longitudinal datasets for predictive modeling in oncology.
  • Implement data preprocessing, feature engineering, and validation workflows for ML models.
  • Design and implement LLM pipelines to extract cancer‑specific variables from unstructured clinical text.
  • Develop, test, and maintain multi‑prompt or multi‑stage LLM workflows.
  • Evaluate LLM outputs using gold‑standard annotations and quantitative metrics.
  • Support model validation, error analysis, and generalizability testing across cohorts.
  • Contribute to reusable analytic and modeling frameworks across disease sites.
Research Collaboration and Translational Support
  • Collaborate with clinicians, informaticians, programmers, and investigators on analytic workflows.
  • Participate in interdisciplinary research teams across oncology and informatics.
  • Support manuscript‑ and grant‑related analyses with reproducible…