Senior Data Engineer (AWS) - Metadata & Vector Embeddings

Job in Caerphilly, Caerphilly County, CF83, Wales, UK
Listing for: Zensar Technologies
Full Time position
Listed on 2026-05-30
Job specializations:
  • Software Development
    Data Engineer
Job Description & How to Apply Below
Position: Senior Data Engineer (AWS) — Metadata & Vector Embeddings

This role is the overall technical lead and architect on the engagement. The engineer designs the metadata schema, builds the simulation onboarding pipeline, deploys the metadata embedding pipeline and the OpenSearch k-NN vector store, and authors the data export format specification for the AI/ML use case. It is the deepest technical seat on the engagement.

Key responsibilities on this engagement
  • Run the Sprint 1 architecture review of the existing UAT codebase (S3 + Glue + S3 Tables + OpenSearch + Athena) and deliver written gap findings.
  • Design the metadata schema, taxonomy, and field catalogue (Light, Brain, Power).
  • Tune data orchestration: Glue jobs, Athena queries, S3 Tables configuration, and scheduling.
  • Lead the deep‑dive technical sessions with analysts on visualization requirements.
  • Build and validate the simulation data onboarding pipeline against real data — including the 30 GB‑per‑run acoustic spectra dataset.
  • Configure and validate the OpenSearch k-NN vector store and the Bedrock embedding pipeline.
  • Author the AI/ML data export format specification and the AI onboarding pattern document.
  • Co‑design the API middleware blueprint with the Cloud Infrastructure Architect.
Must Have:
  • Principal‑level hands‑on data engineering on AWS — 7+ years
  • Deep production experience with S3, S3 Tables, Glue, Athena, and OpenSearch
  • Built and shipped vector embedding workloads
  • Strong metadata modelling and data taxonomy design experience for scientific
  • Comfort working with Parquet, JSON‑LD, and large binary scientific data formats (mesh, time‑series, spectra)
  • Python proficiency; PySpark / Glue job tuning experience
Nice‑to‑have / differentiators
  • Familiarity with surrogate model training data pipelines
  • Experience with SageMaker Unified Studio or comparable governed data‑mesh tooling (in case integration is required)
  • Published or contributed to AWS data architecture patterns or blueprints
Position Requirements
10+ Years work experience