Data Warehouse Engineer - Intern
Job in Dubai, UAE
Listed on 2026-02-02
Listing for: Parsons Oman
Apprenticeship/Internship position
Job specializations:
- IT/Tech: Data Engineer, Cloud Computing, Database Administrator, Data Analyst
Job Description
In a world of possibilities, pursue one with endless opportunities. Imagine Next! At Parsons, you can imagine a career where you thrive, work with exceptional people, and be yourself. Guided by our leadership vision of valuing people, embracing agility, and fostering growth, we cultivate an innovative culture that empowers you to achieve your full potential. Unleash your talent and redefine what’s possible.
Position Overview
Parsons is seeking a high-potential Data Engineer Graduate Intern to join our Technology and Innovation team. This role is designed for candidates with strong analytical foundations and an interest in building scalable, enterprise-grade data platforms that support operational, engineering, and executive decision-making.
Key Responsibilities
Data Processing
- Work with frameworks like Apache Spark, Hadoop, or Apache Beam to process large datasets efficiently.
- Support development of batch and streaming data pipelines using Python and distributed processing frameworks such as Apache Spark (Databricks); a minimal illustrative sketch appears after this list
- Assist in processing and transforming structured and semi-structured data at scale
- Assist in designing and implementing ETL/ELT workflows for data integration and transformation using Azure Data Factory, Databricks, or equivalent tools
- Support data ingestion from multiple sources (databases, APIs, files, cloud storage)
- Work with Azure-native data services, including:
- Azure Data Factory
- Azure Synapse Analytics
- Azure Data Lake Storage (ADLS Gen2)
- Azure Databricks
- Utilize cloud services such as Azure (Data Factory, Synapse, Data Lake), AWS (S3, Redshift, Glue), or Google Cloud Platform (BigQuery, Dataflow) for data storage and processing.
- Support secure configuration of cloud resources, access controls, and data storage
- Manage and query relational databases (e.g., Azure SQL, SQL Server, PostgreSQL, MySQL, Oracle) and NoSQL databases (e.g., MongoDB, Cassandra, DynamoDB)
- Support analytics and reporting use cases using modern data warehouse / lakehouse architectures
- Support the development and optimization of modern data warehouse solutions like Databricks, Snowflake, Redshift, or BigQuery.
- Build and manage workflows using orchestration tools such as Azure Data Factory pipelines, Apache Airflow, Prefect, or Luigi (see the orchestration sketch after this list)
- Support scheduling, monitoring, and failure handling of data pipelines
- Work with distributed data systems and storage solutions like HDFS or cloud-native equivalents.
- Collaborate with the team using Git-based workflows (Azure DevOps Repos or GitHub) for code versioning and management
- Diagnose and resolve performance issues in data systems; support data quality checks, performance tuning, and query optimization (see the data-quality sketch after this list)
- Assist with documentation of data pipelines, schemas, and system design
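
As a minimal sketch of the batch pipeline work described above, the following PySpark snippet reads hypothetical JSON landing data, applies simple transformations, and writes partitioned Parquet; every path and column name here is an illustrative placeholder, not a Parsons system.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-batch-etl").getOrCreate()

# Ingest semi-structured JSON from a hypothetical landing path
raw = spark.read.json("/landing/orders/")

# Normalise types, derive a partition column, and drop rows without a key
cleaned = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("order_id").isNotNull())
)

# Write columnar output partitioned by date so downstream queries can prune files
cleaned.write.mode("overwrite").partitionBy("order_date").parquet("/curated/orders/")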
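
For the orchestration bullets, here is a minimal Apache Airflow sketch showing daily scheduling and retry-based failure handling; the DAG name and task bodies are hypothetical placeholders.

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # pull from source systems (databases, APIs, files)

def transform():
    ...  # e.g. trigger the Spark job sketched above

with DAG(
    dag_id="orders_daily",
    start_date=datetime(2026, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
    default_args={
        "retries": 2,                         # failure handling: retry twice
        "retry_delay": timedelta(minutes=10),
    },
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task            # extract must finish before transform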
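
And for the data-quality bullet, a sketch of a simple quality gate that fails the run when the key column contains nulls or duplicates; the table path and column name are again placeholders.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-dq-check").getOrCreate()
df = spark.read.parquet("/curated/orders/")

# Count violations of two basic expectations on the (hypothetical) key column
null_keys = df.filter(F.col("order_id").isNull()).count()
dupe_keys = df.count() - df.dropDuplicates(["order_id"]).count()

# Fail loudly so the orchestrator's retry/alerting logic can react
if null_keys > 0 or dupe_keys > 0:
    raise ValueError(f"DQ check failed: {null_keys} null keys, {dupe_keys} duplicates")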
Qualifications
- Proficiency in Python
- Experience with scripting languages for automation
- Solid understanding of SQL for data querying and transformation
- Hands-on experience with Apache Spark, Hadoop, or Apache Beam
- Familiarity with ETL/ELT concepts and data pipeline design
- Experience with relational databases (PostgreSQL, MySQL, Oracle)
- Experience with NoSQL databases (MongoDB, Cassandra, DynamoDB)
- Familiarity with Microsoft Azure data services
- Azure Data Factory
- Azure Synapse Analytics
- Azure Data Lake
- Azure Databricks
- Awareness of Azure security and identity concepts (RBAC, managed identities) is advantageous
- Experience with Databricks, Snowflake, Redshift, or BigQuery
- Knowledge of tools like Apache Airflow, Prefect, or Luigi
- Experience with distributed data systems and storage solutions like HDFS
- Proficiency with Git for code versioning and collaboration
- Exposure to Azure DevOps or GitHub Actions
- Familiarity with Agile / Scrum delivery environments
- Interest in enterprise analytics, cloud platforms, and data governance
- Awareness of data privacy and governance principles (e.g., GDPR concepts)
- Note: Multi-cloud exposure (AWS / GCP) is beneficial but not required; the primary environment is Microsoft Azure.
- Experience: Practical exposure to building and optimizing scalable data pipelines, including batch and real-time data processing.
- Debugging: Familiarity with diagnosing and resolving performance issues in data systems.
- Data Governance: Understanding of data privacy regulations (e.g., GDPR, CCPA) and experience…