PySpark Developer
Job in Hartford, Hartford County, Connecticut, 06112, USA
Listing for: Inizio Partners Corp
Full Time position, listed on 2025-11-19
Job specializations:
- IT/Tech: Data Engineer, Big Data
Job Description
We are seeking a highly skilled and experienced Python and PySpark Developer to join our team. The ideal candidate will be responsible for designing, developing, and optimizing big data pipelines and solutions using Python, PySpark, and distributed computing frameworks. This role involves working closely with data engineers, data scientists, and business stakeholders to process, analyze, and derive insights from large-scale datasets.
Key Responsibilities
- Design and implement scalable data pipelines using PySpark and other big data frameworks (a minimal illustrative sketch follows this list).
- Develop reusable and efficient code for data extraction, transformation, and loading (ETL).
- Optimize data workflows for performance and cost efficiency.
- Process and analyze structured and unstructured datasets.
- Build and maintain data lakes, data warehouses, and other storage solutions.
- Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.
- Troubleshoot and resolve performance bottlenecks in big data pipelines.
- Write clean, maintainable, and well-documented code.
- Ensure compliance with data governance and security policies.
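For illustration, here is a minimal sketch of the kind of PySpark ETL pipeline these responsibilities describe. The input path, column names, and output path are hypothetical placeholders, not part of the actual role:

```python
# Minimal PySpark ETL sketch: extract, transform, load.
# All paths and column names below are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-example").getOrCreate()

# Extract: read raw structured data from a (hypothetical) data lake path.
orders = spark.read.parquet("s3://example-bucket/raw/orders/")

# Transform: filter, derive a date column, and aggregate.
daily_revenue = (
    orders
    .filter(F.col("status") == "completed")
    .withColumn("order_date", F.to_date("created_at"))
    .groupBy("order_date")
    .agg(
        F.sum("amount").alias("revenue"),
        F.countDistinct("customer_id").alias("customers"),
    )
)

# Load: write the curated result back, partitioned by date.
(daily_revenue.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/curated/daily_revenue/"))

spark.stop()
```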
Skills & Qualifications
Programming Skills
- Proficient in Python with experience in data processing libraries like Pandas and NumPy.
- Strong experience with PySpark and Apache Spark.
- Hands‑on experience with big data platforms such as Hadoop, Databricks, or similar.
- Familiarity with cloud services like AWS (EMR, S3), Azure (Data Lake, Synapse), or Google Cloud (BigQuery, Dataflow).
- Strong knowledge of SQL and NoSQL databases.
- Experience working with relational databases like PostgreSQL, MySQL, or Oracle.
- Experience with workflow orchestration tools like Apache Airflow or similar (see the sketch after this list).
- Ability to solve complex data engineering problems efficiently.
- Strong communication skills to work effectively in a collaborative environment.
- Knowledge of data lakehouse architectures and frameworks.
- Familiarity with machine learning pipelines and integration.
- Experience with CI/CD tools and DevOps practices for data workflows.
- Certification in Spark, Python, or cloud platforms is a plus.
- Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
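As a sketch of the orchestration pattern the Airflow requirement refers to, here is a minimal Airflow 2.x DAG; the DAG id, schedule, and task body are hypothetical examples:

```python
# Minimal Apache Airflow DAG sketch for scheduling a daily ETL step.
# DAG id, schedule, and the callable's contents are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def run_etl():
    # Placeholder: in practice this might submit the PySpark job,
    # e.g. via a SparkSubmitOperator or a managed-platform job trigger.
    print("running ETL step")


with DAG(
    dag_id="daily_revenue_etl",
    start_date=datetime(2025, 1, 1),
    schedule="@daily",  # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
) as dag:
    etl_task = PythonOperator(task_id="run_etl", python_callable=run_etl)
```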