Hadoop Spark Data Engineer Job - London, England, UK (IT/Tech)

Hadoop Spark Data Engineer - London

Reference Code:

Type: Permanent

Professional Communities: Data & AI

About the Job you are considering:

Results-driven Hadoop Spark Data Engineer with strong expertise in designing and implementing scalable big data solutions using Scala and Apache Spark. Experienced in working with the Hadoop ecosystem, including HDFS, Hive, and YARN, to process and analyze large datasets efficiently. Skilled in building robust ETL pipelines, real-time data processing, and optimizing distributed systems for performance and reliability. Proficient in SQL, data modeling, and integrating data from multiple sources in cloud and on-prem environments.

Hybrid working:

The places that you work from day to day will vary according to your role, your needs, and those of the business. It will be a blend of Company offices, client sites, and your home; note that you will be unable to work at home 100% of the time.

Your Role:

Design and develop Hadoop based applications and data pipelines.
Build, operate, monitor, and troubleshoot Hadoop clusters.
Write scalable ETL processes using tools like Hive, Pig, and Spark.
Develop and maintain data ingestion processes using Sqoop, Flume, or Kafka.
Optimize MapReduce jobs and manage HDFS storage.
Collaborate with data scientists and analysts to support data needs.
Ensure data security and compliance with organizational policies.
Create and maintain technical documentation and playbooks.
Evaluate and integrate cloud-based big data solutions (AWS, GCP, Azure).

Your Skills:

Proficient in Scala programming with strong expertise in functional programming concepts for building scalable data applications.
Extensive experience in Apache Spark (Core, SQL, and Streaming) for processing large-scale distributed data efficiently.
Strong knowledge of Hadoop ecosystem components including HDFS, YARN, Hive, and HBase.
Skilled in designing and developing ETL pipelines and handling structured and unstructured big data.
Experienced in performance tuning, data optimization, and working with distributed systems in cloud or on-prem environments.

We are a Disability Confident Employer:

Capgemini is proud to be a