Junior Geospatial Data Scientist; Pipeline & Algorithm Focus
Listed on 2026-06-19
Engineering
AI Engineer (Applied/Software)
Location
San Francisco
Employment Type
Full time
Location Type
On-site
Department
Software Engineering
About Matter Intelligence
Welcome to Matter, where we are building the future of vision AI: pairing a world-first sensor that sees molecular chemistry, temperature, and 3D shape with a Large World Model that will be the most powerful intelligence engine for our physical world. This system doesn't just see what something looks like; it understands everything from a single pixel. We call this Superintelligent Vision.
You'll join a team that has delivered technologies to Mars for NASA/JPL, co-founded and led infrastructure for OpenAI, designed cutting-edge sensors for U.S. Defense, and invented core algorithms for spectral and 3D imaging. We've come together to build the next infrastructure for vision and intelligence in the physical world.
About the Role
We are looking for a Junior Geospatial Data Scientist to build and maintain the processing pipelines and algorithms that turn our satellite's raw hyperspectral imagery into science-grade data products. This is not a pure research role: you will write production code that runs at scale, and you'll need the math and physics intuition to know why each step in the pipeline works, not just how.
You'll work alongside senior remote sensing scientists and software engineers to implement radiometric corrections, atmospheric compensation, spectral analysis algorithms, and geospatial data transformations. If you're the kind of person who can derive a reflectance equation on a whiteboard and then implement it as a clean, tested Python module—this role is for you.
Key Responsibilities
Pipeline Development
Build, test, and maintain scalable data processing pipelines for satellite imagery—from raw sensor data through calibrated, orthorectified, analysis-ready products.
Write modular, well-documented Python code that runs reliably in cloud environments (AWS), not one-off notebooks.
Implement radiometric calibration, atmospheric correction, geometric orthorectification, and spectral resampling stages within automated workflows.
Algorithm Implementation
Implement and optimize spectral analysis algorithms including classification, unmixing, regression, and retrieval methods for hyperspectral data.
Translate mathematical and physical models (e.g., radiative transfer, Beer-Lambert law, spectral mixture analysis) into performant, validated code.
Benchmark algorithm accuracy against ground truth and reference datasets.
Data Engineering
Manipulate complex raster data at scale using GDAL, Rasterio, Xarray, and related geospatial libraries.
Work with high-dimensional spectral cubes—understanding data structures, coordinate systems, and metadata conventions (e.g., ENVI, GeoTIFF, NetCDF/HDF).
Optimize data I/O, memory management, and compute for large imagery datasets.
Scientific Rigor
Apply your understanding of remote sensing physics (reflectance, radiance, atmospheric effects, sensor response functions) to ensure every pipeline stage is scientifically sound.
Participate in calibration/validation efforts, comparing algorithm outputs against known references.
Flag and investigate anomalies—understanding when results look wrong and diagnosing whether the issue is data, code, or physics.
Required
B.S. or M.S. in Applied Mathematics, Statistics, Physics, or a closely related quantitative field.
Hands‑on experience with hyperspectral or multispectral remote sensing data (spectral cubes, high‑dimensional imagery).
Strong Python proficiency with an engineering mindset—you write modular, testable, version‑controlled code, not just scripts.
Understanding of the math and physics behind remote sensing: reflectance, atmospheric correction, radiative transfer basics, spectral analysis.
Experience with geospatial data tools: GDAL, Rasterio, Xarray, NumPy, SciPy.
Comfort working with raster data formats and coordinate reference systems.
Preferred
Experience building data pipelines in cloud environments (AWS S3, EC2, Lambda, Batch).
Familiarity with atmospheric correction models (MODTRAN, 6S, FLAASH) or radiative transfer concepts.
Exposure to ML frameworks (PyTorch, TensorFlow, scikit‑learn) applied to…