
Principal Data Engineer ML Platforms

Job in Arlington, Arlington County, Virginia, 22201, USA
Listing for: Ccrps
Full Time position
Listed on 2025-12-15
Job specializations:
  • IT/Tech: AI Engineer, Data Engineer
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 USD per year
Job Description

Principal Data Engineer ML Platforms

Altarum Institute | Data & AI Center of Excellence (CoE)

Altarum is building the future of data and AI infrastructure for public health - and we are looking for a Principal Data Engineer ML Platforms to help lead the way. In this cornerstone role, you will design, build and operationalize the modern data and ML platform capabilities that power analytics, evaluation, AI modeling and interoperability across all Altarum divisions. If you want to architect impactful systems, enable data science at scale, and help ensure public health and Medicaid programs operate with secure, explainable, and trustworthy AI - this role is for you.

What You'll Work On

This role blends deep engineering with applied ML enablement:
  • ML Platform Engineering: modern lakehouse architecture, pipelines, MLOps lifecycle.
  • Applied ML enablement: risk scoring, forecasting, Medicaid analytics.
  • NLP/Generative AI support: RAG, vectorization, health communications.
  • Causal ML operationalization: evaluation modeling workflows.
  • Responsible/Trusted AI engineering: model cards, fairness, compliance.

Your work ensures that Altarum's public health and Medicaid programs run on secure, scalable, reusable, and explainable data and AI infrastructure.

What You'll Do
Platform Architecture & Delivery
  • Design and operate a modern, cloud-agnostic lakehouse architecture using object storage, SQL/ELT engines, and dbt.
  • Build CI/CD pipelines for data, dbt, and model delivery (GitHub Actions, GitLab, Azure DevOps).
  • Implement MLOps systems: MLflow (or equivalent), feature stores, model registry, drift detection, automated testing.
  • Engineer solutions in AWS and AWS GovCloud today, with portability to Azure Government or GCP.
  • Use Infrastructure-as-Code (Terraform, CloudFormation, Bicep) to automate secure deployments.
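For illustration, a minimal sketch of the experiment-tracking and model-registration piece of the MLOps lifecycle named above, using MLflow and scikit-learn; the tracking URI, experiment name, and registered model name are placeholders, and the data is synthetic.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Placeholder URI: assumes a running MLflow tracking server with a model registry.
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("risk-scoring")

# Synthetic stand-in for an ML-ready feature table.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

with mlflow.start_run():
    model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("auc", auc)
    # Logs the model artifact and registers a new version under a placeholder name.
    mlflow.sklearn.log_model(model, "model", registered_model_name="risk-score-demo")
```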
Pipelines & Interoperability
  • Build scalable ingestion and normalization pipelines for healthcare and public health datasets, including:
    • FHIR R4 / US Core (strongly preferred)
    • HL7 v2 (strongly preferred)
    • Medicaid/Medicare claims & encounters (strongly preferred)
    • SDOH & geospatial data (preferred)
    • Survey, mixed-methods, and qualitative data
  • Create reusable connectors, dbt packages, and data contracts for cross-division use.
  • Publish clean, conformed, metrics-ready tables for Analytics Engineering and BI teams.
  • Support Population Health in turning evaluation and statistical models into pipelines.
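To make the interoperability work concrete, here is a small, hedged sketch (not Altarum pipeline code) that normalizes Patient resources from a FHIR R4 Bundle into flat, table-ready rows; the bundle is a synthetic example.

```python
import json

# Synthetic FHIR R4 Bundle containing a single Patient resource.
bundle_json = """
{
  "resourceType": "Bundle",
  "entry": [
    {"resource": {"resourceType": "Patient", "id": "p1",
                  "name": [{"family": "Doe", "given": ["Jan"]}],
                  "gender": "female", "birthDate": "1980-04-02"}}
  ]
}
"""

def flatten_patients(bundle: dict) -> list[dict]:
    """Normalize Patient resources in a FHIR Bundle into flat rows for a conformed table."""
    rows = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        if resource.get("resourceType") != "Patient":
            continue
        name = (resource.get("name") or [{}])[0]
        rows.append({
            "patient_id": resource.get("id"),
            "family_name": name.get("family"),
            "given_name": " ".join(name.get("given", [])),
            "gender": resource.get("gender"),
            "birth_date": resource.get("birthDate"),
        })
    return rows

print(flatten_patients(json.loads(bundle_json)))
```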
Data Quality, Reliability & Cost Management
  • Define SLOs and alerting; instrument lineage & metadata; ensure at least 95% of data tests pass.
  • Perform performance and cost tuning (partitioning, storage tiers, autoscaling) with guardrails and dashboards.
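A sketch of the 95% data-test pass-rate target mentioned above, assuming test results have already been parsed from dbt or a similar runner; the test names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class TestResult:
    name: str
    passed: bool

def meets_test_slo(results: list[TestResult], threshold: float = 0.95) -> bool:
    """Return True when the share of passing data tests meets the SLO threshold."""
    if not results:
        return False
    pass_rate = sum(r.passed for r in results) / len(results)
    print(f"data-test pass rate: {pass_rate:.1%} (SLO >= {threshold:.0%})")
    return pass_rate >= threshold

# Illustrative results, as might be parsed from a dbt test run.
meets_test_slo([
    TestResult("not_null_claim_id", True),
    TestResult("unique_member_id", True),
    TestResult("accepted_values_claim_status", False),
])
```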
Applied ML Enablement
  • Build production-grade pipelines for risk prediction, forecasting, cost/utilization models, and burden estimation.
  • Develop ML-ready feature engineering workflows and support time-series/outbreak detection models.
  • Integrate ML assets into standardized deployment workflows.
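As an example of ML-ready feature engineering for forecasting and outbreak detection, the pandas sketch below derives rolling-window features from synthetic weekly case counts.

```python
import pandas as pd

# Synthetic weekly case counts for a single jurisdiction.
df = pd.DataFrame({
    "week": pd.date_range("2024-01-01", periods=8, freq="W"),
    "cases": [12, 15, 14, 30, 45, 44, 38, 60],
})

# Features a forecasting or outbreak-detection model might consume.
df["cases_ma_3w"] = df["cases"].rolling(window=3).mean()  # 3-week moving average
df["cases_lag_1w"] = df["cases"].shift(1)                 # previous week's count
df["cases_wow_change"] = df["cases"].pct_change()         # week-over-week change
print(df)
```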
Generative AI Enablement
  • Build ingestion and vectorization pipelines for surveys, interviews, and unstructured text.
  • Support RAG systems for synthesis, evaluation, and public health guidance.
  • Provide Palladian Partners with secure, controlled-generation environments.
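A minimal sketch of an ingestion-and-vectorization flow behind RAG over unstructured text; the embed function is a stand-in for a real embedding model, and the documents are invented examples.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in embedding: a real pipeline would call an embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    vec = rng.normal(size=dim)
    return vec / np.linalg.norm(vec)

def chunk(text: str, size: int = 40) -> list[str]:
    """Naive fixed-size chunking of unstructured text (survey answers, interview notes)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

documents = [
    "Survey respondents reported barriers to completing Medicaid renewal forms on time.",
    "Interview notes describe outreach strategies used by community health workers.",
]
chunks = [c for doc in documents for c in chunk(doc)]
index = np.vstack([embed(c) for c in chunks])  # simple in-memory vector index

query_vec = embed("What barriers did respondents report?")
best = int(np.argmax(index @ query_vec))       # cosine similarity; vectors are unit-norm
print("top chunk for retrieval:", chunks[best])
```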
Causal ML & Evaluation Engineering
  • Translate R/Stata/SAS evaluation code into reusable pipelines.
  • Build templates for causal inference workflows (DID, AIPW, CEM, synthetic controls).
  • Support operationalization of ARA's applied research methods at scale.
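For a flavor of the causal inference templates, the sketch below computes a simple two-period difference-in-differences (DID) estimate with pandas; the treated/comparison outcomes are synthetic.

```python
import pandas as pd

# Synthetic evaluation data: outcome by group (treated vs. comparison) and period (pre vs. post).
df = pd.DataFrame({
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],
    "post":    [0, 0, 1, 1, 0, 0, 1, 1],
    "outcome": [10.0, 11.0, 16.0, 17.0, 9.0, 10.0, 11.5, 12.5],
})

means = df.groupby(["treated", "post"])["outcome"].mean()
# DID = (treated post - treated pre) - (comparison post - comparison pre)
did = (means.loc[(1, 1)] - means.loc[(1, 0)]) - (means.loc[(0, 1)] - means.loc[(0, 0)])
print(f"difference-in-differences estimate: {did:.2f}")
```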
Responsible AI, Security & Compliance
  • Implement Model Card Protocol (MCP) and fairness/explainability tooling (SHAP, LIME).
  • Ensure compliance with HIPAA, 42 CFR Part 2, IRB/DUA constraints, and NIST AI RMF standards.
  • Enforce privacy-by-design: tokenization, encryption, least-privilege IAM, and VPC isolation.
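One small piece of fairness tooling might look like the demographic-parity check below; the scored cohort is synthetic, and in practice such a metric would feed a model card alongside SHAP or LIME explanations.

```python
import pandas as pd

# Synthetic scored cohort with a sensitive attribute; real inputs would come from the
# model registry and the conformed tables described above.
scored = pd.DataFrame({
    "group":   ["A", "A", "A", "B", "B", "B"],
    "flagged": [1, 0, 1, 0, 0, 1],  # model's high-risk flag
})

rates = scored.groupby("group")["flagged"].mean()
parity_gap = rates.max() - rates.min()
print(rates)
print(f"demographic parity gap: {parity_gap:.2f}")
```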
Reuse, Shared‑Services, and Enablement
  • Develop runbooks, architecture diagrams, repo templates, and accelerator code.
  • Pair with data scientists, analysts, and SMEs to build organizational capability.
  • Provide technical guidance for proposals and client engagements.
Your First 90 Days – You will make a meaningful impact fast. Expected outcomes include:
  • Platform skeleton operational: repo templates, CI/CD, dbt project,…