Cygnet Infotech
Job Title - Technical Lead - Data Engineering
Work Locations
- Ahmedabad / Pune / Bangalore / Vadodara
Work Mode
- Work from Office
Availability for Joining
- Immediate to 15 Days
Role Overview
We are looking for a Technical Lead to architect and deliver a greenfield Data Lakehouse platform on GCP.
This role requires technical ownership, architectural decision-making, and team leadership.
You will define standards, design fault-tolerant, idempotent pipelines processing 4 to 5 million records per day, and lead junior engineers while remaining deeply hands-on.
Experience
- 8+ years overall engineering experience
- 5+ years hands-on Big Data experience (Spark, distributed processing, cloud data platforms)
- Proven experience leading engineers and building robust pipeline orchestration
- Experience building data platforms from scratch (greenfield) is mandatory
Key Responsibilities
Ownership & Leadership
- Define engineering standards, coding guidelines, and data quality practices
- Act as the technical authority for the data platform
- Make and defend architectural decisions (Iceberg design, partitioning, incremental strategies)
- Lead and mentor junior engineers (design reviews, PR reviews, technical guidance)
- Break ambiguous requirements into executable technical plans
- Partner with stakeholders to align platform design with business needs
Data Engineering
- Build and review pipelines ingesting data from databases, APIs, files
- Develop large-scale transformations using PySpark / BigQuery / Databricks and dbt
- Implement Delta Lake / Apache Iceberg tables with proper schema evolution and partitioning
- Design idempotent, retry-safe pipelines ensuring data integrity across failures (a sketch of this pattern follows this list)
- Handle logical data validation errors and runtime exceptions gracefully
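As a minimal sketch of the idempotent-load pattern named above (the catalog name "lake", the warehouse path, the table lake.sales.orders, and the merge key order_id are all hypothetical, and the Iceberg Spark runtime is assumed to be on the classpath), an Iceberg MERGE keyed on a natural identifier makes re-runs of the same batch converge to the same table state instead of appending duplicates:

from pyspark.sql import SparkSession

# Spark session wired for Iceberg (catalog name and warehouse path are assumptions)
spark = (
    SparkSession.builder
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "gs://my-bucket/warehouse")
    .getOrCreate()
)

# Hypothetical target table: partitioned by day for pruning; Iceberg tracks
# schema evolution in table metadata rather than in the files themselves
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.sales.orders (
        order_id BIGINT,
        customer_id BIGINT,
        amount DECIMAL(12, 2),
        event_ts TIMESTAMP
    )
    USING iceberg
    PARTITIONED BY (days(event_ts))
""")

incoming = spark.read.parquet("gs://my-bucket/staging/orders/batch_dt=2024-01-01/")
incoming.createOrReplaceTempView("incoming_orders")

# MERGE keyed on order_id: re-running the same batch updates rows to the
# same values instead of inserting duplicates, so retries are safe
spark.sql("""
    MERGE INTO lake.sales.orders AS t
    USING incoming_orders AS s
    ON t.order_id = s.order_id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")

Because each Iceberg commit is an atomic snapshot, a run that fails mid-write leaves no partial data behind, which lines up with the reliability requirements below.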
Reliability & Quality
- Build fault-tolerant pipelines with safe re-runs and backfills
- Implement data validation, reconciliation, and quality checks (see the sketch after this list)
- Ensure no duplicate data, no partial writes, and consistent outcomes
- Design for observability, monitoring, and rapid failure recovery
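As an illustrative sketch only (the paths, table name, and checked metrics are assumptions carried over from the example above), a reconciliation step can compare row counts and totals between the staged source and the loaded partition, and fail fast before bad data propagates downstream:

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Hypothetical source extract and the Iceberg target from the previous sketch
source = spark.read.parquet("gs://my-bucket/staging/orders/batch_dt=2024-01-01/")
target = spark.table("lake.sales.orders").where(
    F.col("event_ts").cast("date") == "2024-01-01"
)

# Reconciliation: row counts and amount totals must match source vs. target
src = source.agg(F.count("*").alias("rows"), F.sum("amount").alias("total")).first()
tgt = target.agg(F.count("*").alias("rows"), F.sum("amount").alias("total")).first()

if src["rows"] != tgt["rows"] or src["total"] != tgt["total"]:
    raise ValueError(f"Reconciliation failed: source={src}, target={tgt}")

# Duplicate check: the merge key must be unique within the loaded partition
dupes = target.groupBy("order_id").count().where(F.col("count") > 1).count()
if dupes:
    raise ValueError(f"{dupes} duplicate order_id values found")

Running such a check as the final task of an Airflow DAG gives each run a hard gate: the pipeline only reports success when the loaded data reconciles.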
Required Technical Stack
- GCP / AWS
- Google BigQuery, Redshift, Databricks
- Apache Iceberg
- PySpark / Spark SQL
- dbt (Core)
- Apache Airflow
- Advanced SQL and Python