Postdoctoral Appointee - Artificial Intelligence Data Science - Hybrid
Lenexa, Johnson County, Kansas, 66215, USA
Listed on 2026-02-12
IT/Tech
Data Scientist, Data Science Manager, Data Engineer, AI Engineer
About Sandia:
Sandia National Laboratories is the nation’s premier science and engineering lab for national security and technology innovation, with teams of specialists focused on cutting‑edge work in a broad array of areas. Some of the main reasons we love our jobs:
- Challenging work with amazing impact that contributes to security, peace, and freedom worldwide
- Extraordinary co‑workers
- Some of the best tools, equipment, and research facilities in the world
- Career advancement and enrichment opportunities
- Flexible work arrangements for many positions include 9/80 (work 80 hours every two weeks, with every other Friday off) and 4/10 (work 4 ten‑hour days each week) compressed workweeks, part‑time work, and telecommuting (a mix of onsite work and working from home)
- Generous vacation, strong medical and other benefits, competitive 401k, learning opportunities, relocation assistance and amenities aimed at creating a solid work/life balance*
World‑changing technologies. Life‑changing careers. Learn more about Sandia at: http://www.sandia.gov
* These benefits vary by job classification.
What Your Job Will Be Like:
Sandia's AI team 1466 is building DOE's next‑generation AI Platform around three pillars (Data, Models, and Infrastructure) to solve high‑impact "lighthouse problems" in agile deterrence, energy dominance, and critical materials. As a Postdoctoral Appointee, you'll join the Data Pillar team to design, implement, and operate Sandia's AI‑ready, zero‑trust data ecosystem. Your work will transform raw simulation outputs, sensor and facility logs, experimental records, and production data into governed, provenance‑tracked, and access‑controlled datasets that power AI models, autonomous agents, and mission workflows across DOE's HPC, cloud, and edge environments.
Key Responsibilities:
- Build and operate an AI‑Ready Lakehouse
- Design and maintain a federated data lakehouse with full provenance/versioning, attribute‑based access control, license/consent automation, and agent telemetry services
- Implement automated, AI‑mediated ingestion pipelines for heterogeneous sources (HPC simulation outputs, experimental instruments, robotics, sensor streams, satellite imagery, production logs)
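
As a rough illustration of what provenance‑tracked ingestion can look like, the sketch below hashes each incoming file and writes a provenance sidecar next to it. The paths, field names, and file layout are illustrative assumptions, not the platform's actual design:

    # Hypothetical sketch: ingest a raw file into a lakehouse directory while
    # recording provenance (source, checksum, ingest time) beside the data.
    # All names and the sidecar layout are assumptions for illustration.
    import hashlib, json, time
    from pathlib import Path

    def ingest(raw_path: str, table_dir: str) -> dict:
        data = Path(raw_path).read_bytes()
        record = {
            "source": raw_path,
            "sha256": hashlib.sha256(data).hexdigest(),  # content address for versioning
            "ingested_at": time.time(),
        }
        dest = Path(table_dir)
        dest.mkdir(parents=True, exist_ok=True)
        (dest / Path(raw_path).name).write_bytes(data)             # the data itself
        (dest / (Path(raw_path).name + ".prov.json")).write_text(  # its provenance sidecar
            json.dumps(record, indent=2))
        return record
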
- Enforce Data Security & Assurance
- Develop a Data Health & Threat program: dataset fingerprinting, watermarking, poisoning/anomaly detection, red‑team sampling, and reproducible training manifests
- Configure secure enclaves and egress processes for CUI, Restricted Data, and other sensitive corpora, with attestation and differential privacy where required
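
A minimal sketch of the dataset‑fingerprinting and reproducible‑manifest idea named above, using content hashes over a directory tree; the manifest format is an assumption for illustration:

    # Hypothetical sketch: fingerprint every file in a dataset and emit a
    # manifest a training run can pin to, so the exact bytes used are
    # reproducible and tampering is detectable. The layout is an assumption.
    import hashlib, json
    from pathlib import Path

    def build_manifest(dataset_dir: str, manifest_path: str) -> str:
        entries = {}
        for f in sorted(Path(dataset_dir).rglob("*")):
            if f.is_file():
                entries[str(f)] = hashlib.sha256(f.read_bytes()).hexdigest()
        # Fingerprint of the whole dataset: hash of the concatenated per-file hashes.
        dataset_fp = hashlib.sha256("".join(entries.values()).encode()).hexdigest()
        Path(manifest_path).write_text(json.dumps(
            {"dataset_fingerprint": dataset_fp, "files": entries}, indent=2))
        return dataset_fp
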
- Define and Implement Data Governance
- Establish FAIR‑compliant metadata standards, data catalogs, and controlled‑vocabulary ontologies
- Automate lineage tracking, quality checks, schema validation, and leak controls at record‑level granularity
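
The record‑level quality and schema checks could look roughly like the sketch below; the field names and validity rules are hypothetical examples, not an established Sandia schema:

    # Hypothetical sketch: record-level schema validation for an ingestion
    # pipeline. Field names and plausibility bounds are invented examples.
    def validate_record(rec: dict) -> list[str]:
        errors = []
        if not isinstance(rec.get("sensor_id"), str):
            errors.append("sensor_id must be a string")
        if not isinstance(rec.get("timestamp"), (int, float)):
            errors.append("timestamp must be numeric")
        value = rec.get("value")
        if not isinstance(value, (int, float)) or not (-1e6 <= value <= 1e6):
            errors.append("value missing or out of plausible range")
        return errors  # empty list means the record passes

Records that fail such checks would typically be quarantined rather than silently dropped, preserving lineage for later audit.
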
- Instrument AI Workflows with Standardized Telemetry
- Deploy Agent Trace Schema (ATS) and Agent Run Record (ARR) frameworks to log tool calls, decision graphs, human hand‑offs, and environment observations
- Treat agent‑generated artifacts (plans, memory, configurations) as first‑class data objects
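
To make the telemetry idea concrete, here is a hedged sketch of logging a tool call as a structured, run‑linked event in the spirit of ATS/ARR; the exact field set those frameworks capture is an assumption here:

    # Hypothetical sketch of an Agent Trace Schema-style event: each tool
    # call an agent makes is appended as a structured record tied to its run.
    import json, time, uuid

    def log_tool_call(run_id: str, tool: str, args: dict, result: str,
                      trace_file: str = "agent_trace.jsonl") -> None:
        event = {
            "event_id": str(uuid.uuid4()),
            "run_id": run_id,          # ties the event to an Agent Run Record
            "timestamp": time.time(),
            "kind": "tool_call",
            "tool": tool,
            "args": args,
            "result_summary": result[:200],  # truncate large outputs
        }
        with open(trace_file, "a") as f:
            f.write(json.dumps(event) + "\n")
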
- Collaborate Across Pillars
- Work with Models and Interfaces teams to integrate data services into training, evaluation, and inference pipelines
- Partner with Infrastructure engineers to optimize data movement, tiered storage, and high‑bandwidth networking (ESnet) between HPC, cloud, and edge
- Engage domain scientists and mission leads for agile deterrence, energy grid, and critical materials use cases to curate problem‑specific datasets
- Support Continuous Acquisition & Benchmarking
- Design edge‑to‑exascale data acquisition systems with robotics and instrument integration
- Develop data/AI benchmarks (datasets, tools, and metrics) for pipeline performance, model evaluation, and mission KPIs
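
A toy benchmark harness in the spirit of that bullet might time a pipeline stage over a batch of records and report throughput; the metric names are illustrative:

    # Hypothetical sketch: time a pipeline stage over a record batch and
    # report best-of-N throughput. Metric names are invented for illustration.
    import time

    def benchmark(stage, records: list, repeats: int = 3) -> dict:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            for rec in records:
                stage(rec)
            timings.append(time.perf_counter() - start)
        best = min(timings)
        return {
            "records": len(records),
            "best_seconds": best,
            "records_per_second": len(records) / best if best else float("inf"),
        }
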
- Author an AI‑mediated parser for a new experimental instrument, automatically extracting and cataloging metadata
- Implement an attribute‑based policy that blocks unapproved data combinations in a classified enclave (sketched after this list)
- Prototype a streaming pipeline that feeds live sensor data from a nuclear facility into an HPC training queue
- Develop a dashboard that alerts on…
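
The attribute‑based policy bullet above can be sketched as a simple combination check; the attribute labels and denied pairs are invented for illustration:

    # Hypothetical sketch: deny combining datasets whose merged attribute
    # sets match a denied pair. Labels and pairs are illustrative only.
    DENIED_COMBINATIONS = [
        {"export_controlled", "foreign_national_access"},
        {"cui", "public_release"},
    ]

    def combination_allowed(dataset_attrs: list[set[str]]) -> bool:
        combined = set().union(*dataset_attrs)
        return not any(denied <= combined for denied in DENIED_COMBINATIONS)

    # Example: joining a CUI corpus with a public-release corpus is blocked.
    assert not combination_allowed([{"cui"}, {"public_release"}])
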