Data Engineer Consultant Job, Jeddah area, Saudi Arabia, IT/Tech

The Job Description

Enterprise Data Ingestion and Data Engineering (1-)
Design and build scalable, reusable ingestion pipelines (real-time and batch) on GCP (GCS, BigQuery, Dataflow). Develop parameterized pipelines using Google-native services and/or Informatica IDMC mappings and taskflows. Implement CDC patterns, idempotent loads, late‑arriving data handling, and schema evolution. Optimize BigQuery ingestion strategies (batch vs. streaming, partitioning, clustering). Establish version control, CI/CD, and environment promotion (dev/test/prod).

Results

GCP-native pipelines (Dataflow, Composer, Cloud Run) and/or IDMC mappings/taskflows.
BigQuery schema designs (raw / stage / curated zones with partitioning & clustering).
CI/CD pipelines, deployment artifacts, and operational runbooks.

Data Mapping & Transformation Design (2-)

Profile source data and analyze data quality and patterns. Design field‑level mappings, business rules, joins, aggregations, and derivations. Implement transformations using BigQuery SQL, Dataflow, or IDMC transformation logic.

Results

Source‑to‑Target Mapping (STM) document.
Transformation mappings.

Orchestration, Automation & Reliability Engineering (3-)

Build and parameterize end‑to‑end workflows using Cloud Composer (Airflow) and/or IDMC taskflows. Define job dependencies, schedules, SLAs, and failure handling strategies. Implement retries, backoff strategies, checkpoints, and restartability. Integrate monitoring, logging, and alerting using Cloud Monitoring, Cloud Logging, and ChatOps tools.

Results

Production‑ready orchestration workflows with environment‑aware parameters.
Scheduling calendar, dependency diagrams, and SLA matrix.
Alerting, monitoring dashboards, and operational SOPs.

Data Pipeline Monitoring, Performance & Cost Optimization (4-)

Monitor pipeline health, data freshness, volumes, and anomaly patterns. Support and monitor data pipelines during off‑hours and weekends, troubleshooting and resolving issues to ensure SLA compliance. Track and manage SLAs related to runtime, failures, cost, and data latency. Optimize BigQuery performance (query refactoring, partition pruning, materialized views). Manage and optimize GCP costs (storage lifecycle, slot usage, query optimization, caching).

Results

Daily/weekly operational and SLA reports.
Performance and cost optimization plans with before/after metrics.
BigQuery tuning guidelines and data materialization strategy.

Requirements

5+ years of experience as a Data Engineer or ETL Developer.
Strong experience with Google Cloud Platform: BigQuery, GCS, Datastream, Dataflow, Cloud Composer, IAM, Cloud Monitoring.
Proficiency in SQL (BigQuery‑optimized) and data modeling for analytics.
Hands‑on experience building production‑grade, scalable data pipelines.
Experience with CI/CD, Git‑based version control, and DevOps practices.
Optional: experience with Informatica Intelligent Data Management Cloud (IDMC), including CDI, Mass Ingestion, and Taskflows.
Bachelor’s degree in Computer Science, Information Technology, Information Systems, or a related field.
