Data Engineer Consultant
Listed on 2026-06-19
IT/Tech
Data Engineering, Cloud Computing: Infrastructure & Operations
The Job Description
Enterprise Data Ingestion and Data Engineering:
Design and build scalable, reusable ingestion pipelines (real-time and batch) on GCP (GCS, BigQuery, Dataflow). Develop parameterized pipelines using Google-native services and/or Informatica IDMC mappings and taskflows. Implement CDC patterns, idempotent loads, late‑arriving data handling, and schema evolution. Optimize BigQuery ingestion strategies (batch vs. streaming, partitioning, clustering). Establish version control, CI/CD, and environment promotion (dev/test/prod).
Results:
- GCP-native pipelines (Dataflow, Composer, Cloud Run) and/or IDMC mappings/taskflows.
- BigQuery schema designs (raw / stage / curated zones with partitioning & clustering).
- CI/CD pipelines, deployment artifacts, and operational runbooks.
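As an illustration of the idempotent-load and late‑arriving data requirements above, here is a minimal in-memory sketch, not a production pattern. The function name `merge_batch` and the field names `id`, `updated_at`, and `amount` are assumptions made for the example; in BigQuery the same idea is usually expressed with a `MERGE` statement keyed on a business key and a version column.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def merge_batch(
    target: Dict[str, Row],
    batch: Iterable[Row],
    key: str = "id",              # hypothetical business key
    version: str = "updated_at",  # hypothetical change-version column
) -> Dict[str, Row]:
    """Upsert rows into target, keeping only the newest version per key."""
    for row in batch:
        current = target.get(row[key])
        # A late-arriving (older) version never overwrites a newer one.
        if current is None or row[version] > current[version]:
            target[row[key]] = row
    return target


batch = [
    {"id": "a", "updated_at": 2, "amount": 20},
    {"id": "b", "updated_at": 1, "amount": 5},
    {"id": "a", "updated_at": 1, "amount": 10},  # late-arriving older change
]

state = merge_batch({}, batch)
replayed = merge_batch(dict(state), batch)  # re-running the same batch
```

Because a row is only written when its version is strictly newer, replaying the same batch leaves the target unchanged, which is the property that makes a failed load safe to restart.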
Profile source data and analyze data quality and patterns. Design field‑level mappings, business rules, joins, aggregations, and derivations. Implement transformations using BigQuery SQL, Dataflow, or IDMC transformation logic.
Results:
- Source‑to‑Target Mapping (STM) document.
- Transformation mappings.
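To make the idea of a field‑level mapping concrete, the sketch below represents a small Source‑to‑Target Mapping as data and applies it to one record. Everything here is hypothetical: the `FieldMapping` type, the `apply_stm` helper, and the source and target field names are invented for illustration and do not describe any specific system.

```python
from typing import Any, Callable, Dict, List, NamedTuple


class FieldMapping(NamedTuple):
    target: str                # target column name
    sources: List[str]         # source fields the rule reads
    rule: Callable[..., Any]   # business rule / derivation


# Hypothetical mapping: one cast, one concatenation, one derivation.
STM = [
    FieldMapping("customer_id", ["cust_no"], lambda v: int(v)),
    FieldMapping(
        "full_name",
        ["first_nm", "last_nm"],
        lambda first, last: f"{first.strip()} {last.strip()}",
    ),
    FieldMapping("order_total", ["qty", "unit_price"], lambda q, p: q * p),
]


def apply_stm(record: Dict[str, Any], stm: List[FieldMapping]) -> Dict[str, Any]:
    """Build one target row by applying each mapping rule to its source fields."""
    return {m.target: m.rule(*(record[s] for s in m.sources)) for m in stm}


source_row = {
    "cust_no": "0042",
    "first_nm": " Ada ",
    "last_nm": "Lovelace",
    "qty": 3,
    "unit_price": 20.5,
}
target_row = apply_stm(source_row, STM)
```

Keeping the mapping as data rather than hard-coded logic means the same table can drive both the STM document and the transformation itself.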
Build and parameterize end‑to‑end workflows using Cloud Composer (Airflow) and/or IDMC taskflows. Define job dependencies, schedules, SLAs, and failure handling strategies. Implement retries, backoff strategies, checkpoints, and restartability. Integrate monitoring, logging, and alerting using Cloud Monitoring, Cloud Logging, and ChatOps tools.
Results:
- Production‑ready orchestration workflows with environment‑aware parameters.
- Scheduling calendar, dependency diagrams, and SLA matrix.
- Alerting, monitoring dashboards, and operational SOPs.
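The retry and backoff requirement above can be sketched in plain Python as follows. This is a generic illustration under stated assumptions, not the way any particular orchestrator implements it: `run_with_retries`, `flaky_load`, and the default delays are hypothetical names and values. In Airflow, comparable behaviour is normally configured on the task itself through operator arguments such as `retries`, `retry_delay`, and `retry_exponential_backoff`.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def run_with_retries(
    task: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying on any exception with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last failure
            # Wait base_delay, then base_delay * factor, and so on.
            sleep(base_delay * factor ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")


calls = {"n": 0}


def flaky_load() -> str:
    """Hypothetical job that fails twice with a transient error, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "loaded"


delays: List[float] = []
# Record the delays instead of sleeping so the example runs instantly.
result = run_with_retries(flaky_load, sleep=delays.append)
```

Passing the `sleep` function as a parameter keeps the sketch testable. A real pipeline would typically also cap the maximum delay and retry only on errors known to be transient.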
Monitor pipeline health, data freshness, volumes, and anomaly patterns. Support and monitor data pipelines during off‑hours and weekends, troubleshooting and resolving issues to ensure SLA compliance. Track and manage SLAs related to runtime, failures, cost, and data latency. Optimize BigQuery performance (query refactoring, partition pruning, materialized views). Manage and optimize GCP costs (storage lifecycle, slot usage, query optimization, caching).
Results:
- Daily/weekly operational and SLA reports.
- Performance and cost optimization plans with before/after metrics.
- BigQuery tuning guidelines and data materialization strategy.
- 5+ years of experience as a Data Engineer or ETL Developer.
- Strong experience with Google Cloud Platform: BigQuery, GCS, Datastream, Dataflow, Cloud Composer, IAM, Cloud Monitoring.
- Proficiency in SQL (BigQuery‑optimized) and data modeling for analytics.
- Hands‑on experience building production‑grade, scalable data pipelines.
- Experience with CI/CD, Git‑based version control, and DevOps practices.
- Optional: experience with Informatica Intelligent Data Management Cloud (IDMC), including CDI, Mass Ingestion, and Taskflows.
- Bachelor’s degree in Computer Science, Information Technology, Information Systems, or a related field.