
AI Data Architect

Job in Exton, Chester County, Pennsylvania, 19341, USA
Listing for: Genzeon
Full Time position
Listed on 2026-06-02
Job specializations:
  • IT/Tech
    Data Engineer, AI Engineer
Salary/Wage Range or Industry Benchmark: 125,000 - 150,000 USD Yearly
Job Description & How to Apply Below

Genzeon is an AI and automation company with deep engineering and data expertise, dedicated to serving the healthcare and retail industries. Our platform solutions – including HIP One, Compliance Pro Solutions, and Patient Engagement Solutions – empower organizations to scale innovation and transform outcomes.

Genzeon is a global community of innovators and problem-solvers, with a culture built on inclusion, flexibility, and purpose-driven work. With four global delivery centers, we support providers, payers, Healthtech, and retail organizations worldwide.

Genzeon has an exciting opening for AI Data Architect | Healthcare AI Platform to join our dynamic team.

Exton, PA / Hybrid

0–4 years

The short version

We run a multi-model AI pipeline that processes 150K Medicare documents/year — faxed PDFs, EDI transactions, FHIR data, clinical notes. You’ll design and build the data architecture that ingests, stores, governs, and serves all of it to AI models and clinical reviewers. On-prem GPUs, hybrid cloud, HIPAA compliance. This is the real thing.

What you’ll do
  • Design the end-to-end data architecture for a healthcare AI platform — ingestion, storage, processing, serving, governance
  • Build pipelines for heterogeneous healthcare data: faxed PDFs, X12 EDI (835/837/278), FHIR R4, HL7v2, CMS files, unstructured clinical notes
  • Architect the data lake/lakehouse layer (Apache Iceberg, MinIO, DuckDB, PostgreSQL/pgvector)
  • Design the embedding and vector storage layer that powers RAG — chunking, indexing, retrieval optimization
  • Build data lineage tracking from source document to AI decision
  • Implement HIPAA/HITRUST data governance — encryption, access controls, audit logging, PHI handling
  • Monitor data quality across the pipeline — schema drift, completeness, freshness, anomalies
  • Optimize for hybrid infrastructure: on-prem GPUs (RTX 50U0, L40S), NAS, Azure Gov Cloud, Azure Commercial
What you need
  • A data pipeline you’ve built that ran in production (we’ll ask about it).
  • SQL fluency and Python proficiency.
  • Experience with at least one of: Spark, dbt, Airflow, Dagster, Prefect.
  • Hands‑on work with unstructured or semi‑structured data — PDFs, images, OCR outputs, free text.
  • Practical understanding of vector databases, embeddings, and how RAG systems consume data.
  • Comfort with on-premises infrastructure, not just managed cloud services.
  • Data quality and governance as instincts, not afterthoughts.
Strong signals
  • Healthcare data formats (X12 EDI, FHIR, HL7, CCD/C-CDA).
  • Apache Iceberg, Delta Lake, or modern table formats.
  • pgvector, Pinecone, Weaviate, or similar vector stores.
  • DuckDB or embedded analytical engines.
  • HIPAA technical safeguards implementation.