
Hadoop Spark Data Engineer

Remote / Online - Candidates ideally in
London, Greater London, W1B, England, UK
Listing for: Capgemini
Remote/Work from Home position
Listed on 2026-06-06
Job specializations:
  • IT/Tech
    Data Engineer, Big Data
Job Description & How to Apply Below
Hadoop Spark Data Engineer - London
Reference Code:
Type: Permanent
Professional Communities: Data & AI

About the Job you are considering:

Results-driven Hadoop Spark Data Engineer with strong expertise in designing and implementing scalable big data solutions using Scala and Apache Spark. Experienced in working with the Hadoop ecosystem, including HDFS, Hive, and YARN, to process and analyze large datasets efficiently. Skilled in building robust ETL pipelines, real-time data processing, and optimizing distributed systems for performance and reliability. Proficient in SQL, data modeling, and integrating data from multiple sources in cloud and on-prem environments.

Hybrid working:

The places that you work from day to day will vary according to your role, your needs, and those of the business; it will be a blend of Company offices, client sites, and your home; noting that you will be unable to work at home 100% of the time.

Your Role:

  • Design and develop Hadoop-based applications and data pipelines.
  • Build, operate, monitor, and troubleshoot Hadoop clusters.
  • Write scalable ETL processes using tools like Hive, Pig, and Spark.
  • Develop and maintain data ingestion processes using Sqoop, Flume, or Kafka.
  • Optimize MapReduce jobs and manage HDFS storage.
  • Collaborate with data scientists and analysts to support data needs.
  • Ensure data security and compliance with organizational policies.
  • Create and maintain technical documentation and playbooks.
  • Evaluate and integrate cloud-based big data solutions (AWS, GCP, Azure).

Your Skills:

  • Proficient in Scala programming with strong expertise in functional programming concepts for building scalable data applications.
  • Extensive experience in Apache Spark (Core, SQL, and Streaming) for processing large-scale distributed data efficiently.
  • Strong knowledge of Hadoop ecosystem components including HDFS, YARN, Hive, and HBase.
  • Skilled in designing and developing ETL pipelines and handling structured and unstructured big data.
  • Experienced in performance tuning, data optimization, and working with distributed systems in cloud or on-prem environments.

We are a Disability Confident Employer:

Capgemini is proud to be a
