
Data Engineer

Job in 201301, Noida, Uttar Pradesh, India
Listing for: SII Group USA
Full Time position
Listed on 2026-02-14
Job specializations:
  • IT/Tech
    Data Engineer, Cloud Computing
Job Description:

Data Engineering Specialist (PySpark / Databricks) – Noida, India

Company:
SII Group

Location:

Noida, India

Employment Type:

Full-time
Role Type:
Data Engineer / Senior Data Engineer (depending on experience)
About SII Group USA
SII Group USA is part of the global SII Group, a leading provider of engineering, consulting, and digital transformation services. Our teams partner with global enterprises to design, build, and scale high-performance technology solutions. We are expanding our data engineering capabilities in India and seeking passionate technologists committed to excellence, innovation, and delivery velocity.

Role Overview
We are looking for an experienced Data Engineering Specialist with hands-on expertise in PySpark, Databricks, Delta Lake, cloud integration (GCP/AWS), CI/CD, IaC, and data modeling. The ideal candidate is comfortable building scalable data pipelines, optimizing performance, implementing governance standards, and accelerating client delivery in a multi-cloud environment.

Key Responsibilities
1. Data Engineering & Lakehouse Development
Design, build, and maintain scalable data pipelines using PySpark and Databricks (see the sketch after this list).
Implement and optimize Delta Lake / Lakehouse architectures for high-volume and real-time workloads.
Ensure robust data ingestion, transformation, and export workflows with scalable orchestration.
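
To illustrate the pipeline work described above, here is a minimal PySpark sketch, assuming a Databricks runtime with Delta Lake and hypothetical landing-zone and lakehouse paths:

    from pyspark.sql import SparkSession, functions as F

    # On Databricks a session already exists; getOrCreate() reuses it.
    spark = SparkSession.builder.appName("orders-ingest").getOrCreate()

    # Hypothetical landing-zone path; in practice this comes from config.
    raw = spark.read.json("/mnt/landing/orders/")

    # Basic cleansing plus a derived partition column.
    cleaned = (
        raw.dropDuplicates(["order_id"])
           .withColumn("order_date", F.to_date("order_ts"))
           .filter(F.col("amount") > 0)
    )

    # Append to a Delta table partitioned by date for efficient pruning.
    (cleaned.write
            .format("delta")
            .mode("append")
            .partitionBy("order_date")
            .save("/mnt/lakehouse/silver/orders"))
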
2. Cloud Integration (GCP & AWS)
Develop integrations between GCP services, Databricks, and enterprise systems.
Utilize AWS services for complementary workloads (S3, Lambda, EC2, IAM, API Gateway, etc.).
Manage hybrid cloud data flows and cross-platform integrations securely and efficiently.
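
One common shape for the hybrid flows above is moving objects between AWS S3 and Google Cloud Storage. A minimal sketch using boto3 and google-cloud-storage, with bucket and object names as placeholders:

    import boto3
    from google.cloud import storage

    # Credentials are assumed to come from the environment or instance
    # roles, never from hard-coded values.
    s3 = boto3.client("s3")
    gcs = storage.Client()

    def copy_s3_object_to_gcs(s3_bucket: str, key: str, gcs_bucket: str) -> None:
        """Copy one object from S3 into a GCS bucket (placeholder names)."""
        body = s3.get_object(Bucket=s3_bucket, Key=key)["Body"].read()
        gcs.bucket(gcs_bucket).blob(key).upload_from_string(body)

    copy_s3_object_to_gcs("example-aws-landing", "exports/orders.parquet",
                          "example-gcp-lakehouse")
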
3. API / REST-Based Data Export
Build and manage API/REST-based data export triggers and automated delivery pipelines.
Architect and optimize data exposure layers for downstream consumption and client interfaces.
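
As a sketch of what such an export trigger might look like (the endpoint, payload, and response field are all hypothetical):

    import requests

    # Hypothetical client endpoint; in practice the URL and token would
    # come from configuration and a secret scope, not constants.
    EXPORT_URL = "https://api.example.com/v1/exports"

    def trigger_export(dataset: str, run_date: str, token: str) -> str:
        resp = requests.post(
            EXPORT_URL,
            json={"dataset": dataset, "run_date": run_date},
            headers={"Authorization": f"Bearer {token}"},
            timeout=30,
        )
        resp.raise_for_status()  # fail loudly so the orchestrator can retry
        return resp.json()["export_id"]  # hypothetical response field
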
4. Infrastructure as Code & DevOps
Implement IaC using Terraform for cloud resource provisioning and environment management.
Develop and maintain CI/CD pipelines for code deployments, Databricks jobs, and infrastructure automation.
Ensure repeatable, scalable, and compliant deployment workflows.
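
As one example of this kind of automation, a CI/CD step can trigger an existing Databricks job through the Jobs 2.1 REST API; the workspace host, token, and job ID are assumed to be injected as pipeline secrets:

    import os
    import requests

    # Placeholders: a real pipeline injects these as masked variables.
    HOST = os.environ["DATABRICKS_HOST"]    # e.g. the workspace URL
    TOKEN = os.environ["DATABRICKS_TOKEN"]

    def run_job(job_id: int) -> int:
        """Kick off a Databricks job run and return its run_id."""
        resp = requests.post(
            f"{HOST}/api/2.1/jobs/run-now",
            headers={"Authorization": f"Bearer {TOKEN}"},
            json={"job_id": job_id},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["run_id"]
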
5. Data Modeling & SQL Optimization
Design logical and physical data models, export structures, and schema standards.
Write and tune complex SQL queries with a focus on performance.
Implement best practices for partitioning, caching, indexing, and cost optimization (see the sketch below).
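
For instance, a minimal Spark SQL sketch of partition pruning and caching, against a hypothetical silver.orders table partitioned by order_date:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Filtering on the partition column lets Delta skip whole partitions
    # instead of scanning the full table.
    daily = spark.sql("""
        SELECT customer_id, SUM(amount) AS total_amount
        FROM silver.orders               -- hypothetical table
        WHERE order_date = '2025-01-01'  -- partition column: pruned
        GROUP BY customer_id
    """)

    # Cache only results that are reused several times within one job.
    daily.cache()
    daily.count()  # materializes the cache
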
6. Security, Governance & IAM
Apply data governance best practices across metadata, lineage, quality, and access control.
Configure and manage IAM, cluster security, encryption, credential handling, and audit logging.
Ensure compliance with enterprise security policies and client requirements.
7. Performance, Scalability & Reliability
Optimize ETL/ELT workflows for cost, latency, and throughput.
Implement monitoring, alerting, and auto-scaling strategies across cloud platforms.
Collaborate with architecture teams to enable long-term scalability and resilience.
8. Client Delivery & Ramp-Up Support
Support rapid client onboarding and help accelerate client delivery.
Collaborate closely with product owners, project managers, and client stakeholders.
Provide guidance to junior engineers, perform code reviews, and help build best-practice frameworks.
Required Skills & Experience
4–10 years of experience in data engineering (range can be adjusted).
Strong hands-on experience with PySpark and Databricks.
Proven expertise in Delta Lake / Lakehouse implementations.

Experience with both GCP (BigQuery, GCS, Pub/Sub, IAM, etc.) and AWS (S3, Lambda, Glue, Redshift, etc.).
Proficiency in Terraform and CI/CD tools (Azure DevOps, GitHub Actions, GitLab CI, Jenkins, etc.).
Strong SQL skills, including performance tuning and query optimization.

Experience with API integrations, REST services, and automated triggers.
Understanding of data security, governance frameworks, and IAM policies.
Experience working in agile delivery teams, with client-facing exposure preferred.

Preferred Qualifications
Databricks certifications (Data Engineer Associate/Professional).
GCP or AWS cloud certifications.
Experience supporting enterprise-scale data programs.
Knowledge of data quality frameworks (e.g., Deequ, Great Expectations).

What We Offer
Opportunity to work with global clients and cutting-edge data technologies.
Collaborative culture with a strong focus on innovation.
Competitive compensation, continuous learning opportunities, and global career mobility.