Data Intern
Listed on 2026-06-04
IT/Tech
Data Engineer, AI Engineer, Data Analyst, Machine Learning/ML Engineer
We're looking for a Geospatial Data Engineering Intern to help build and scale our geospatial data infrastructure over the summer. This role is designed for a junior or senior undergraduate with at least one prior internship under their belt, strong data engineering instincts, and the team awareness to ship work in a shared production codebase. You'll get hands‑on experience building pipelines that ingest, transform, and serve geospatial data, with exposure to AI agent tooling along the way.
This role begins as a full‑time, 3-month summer internship and then continues part‑time through September.
You'll work directly with our data team, contributing to operational infrastructure that powers geospatial analysis and decision‑making across the organization. Your primary focus will be building reliable, well‑documented data pipelines with a geospatial backbone, while getting meaningful exposure to applied AI systems and helping us complete an in‑flight migration from Airflow 2 to Airflow 3.
About your role at Ready
You’ll spend the majority of your time on geospatial data engineering, with supporting work in geospatial analysis and applied AI.
Geospatial Data Engineering (Primary Focus)
- Build and improve Airflow ELT pipelines that ingest, transform, and serve geospatial datasets at scale, working across both our Airflow 2 and Airflow 3 repositories and actively assisting with the Airflow 2 to 3 migration, including porting DAGs, validating parity, and helping retire legacy pipelines
- Write clean, type‑hinted Python and well‑structured SQL, including geospatial operations (PostGIS, spatial joins, CRS management) against Athena (Trino), PostgreSQL, Redshift, and DuckDB
- Develop modular dbt models with semantic layer definitions and documented business logic for geospatial tables
- Contribute to data quality systems, including schema validation, freshness monitoring, and spatial integrity checks
- Support DataHub adoption through schema documentation, lineage tracking, and metadata management for geospatial assets
- Triage failing DAG runs, read Airflow task logs, and own fixes end‑to‑end
- Communicate progress through documentation, code reviews, and regular updates
Geospatial Analysis
- Contribute to research‑oriented analyses such as tree canopy classification, network resiliency analysis, and spatial feature extraction
- Design and document reproducible analytical workflows that feed into production pipelines
- Translate complex geospatial methods into clear, accessible outputs for non‑technical stakeholders
- Share learnings on emerging GeoAI methods and geospatial tooling with the team
Applied AI
- Assist with building data agents using tools like LangGraph, LangChain, or Bedrock AgentCore
- Support development and iteration on pipelines and text‑to‑SQL approaches for natural‑language data access
- Contribute to MCP server development and agent evaluation as needed
- Document agent failure modes and help refine prompts based on feedback
- Currently a junior or senior undergraduate (or higher) in Computer Science, Data Science, GIS, Geospatial Engineering, Software Engineering, or a related field
- At least one prior internship (or equivalent team‑based engineering experience); you've shipped code in a shared repo, taken a code review, and worked a ticket end‑to‑end
- Available to work full‑time for 3 months during the summer, then part‑time through the fall semester
- Strong fundamentals in Python, including classes, inheritance, decorators, type hints, and explicit imports
- Strong fundamentals in SQL: joins, CTEs, window functions, and aggregations
- Comfortable working in Git/GitHub with a dev → main PR‑to‑deploy workflow
- Comfortable on the Unix command line or eager to learn (bash, navigating a file system, running scripts)
- Familiarity with geospatial concepts (CRS, spatial joins, indexing) and tooling such as PostGIS, GeoPandas, or QGIS is a plus
- Exposure to AWS or similar cloud providers; ideally S3, IAM, Athena, Glue, ECS, or Redshift
- Experience with Airflow or similar orchestration tools is a plus (or strong eagerness to learn quickly. You'll be ramping on two versions in parallel and…