Data Engineer (Python/PySpark) Job, Dallas area, Texas, USA, IT/Tech

Position: Data Engineer (Python/PySpark)

We are seeking a Data Engineer with strong proficiency in SQL, Python, and PySpark to support high-performance data pipelines and analytics initiatives. This role focuses on scalable data processing, transformation, and integration work that enables business insights, regulatory compliance, and operational efficiency.

Data Engineer – SQL, Python, and PySpark Expert (Onsite – Dallas, TX)

Key Responsibilities

Design, develop, and optimize ETL/ELT pipelines using SQL, Python, and PySpark for large-scale data environments

Implement scalable data processing workflows in distributed data platforms (e.g., Hadoop, Databricks, or Spark environments)

Partner with business stakeholders to understand and model mortgage lifecycle data (origination, underwriting, servicing, foreclosure, etc.)

Create and maintain data marts, views, and reusable data components to support downstream reporting and analytics

Ensure data quality, consistency, security, and lineage across all stages of data processing

Assist in data migration and modernization efforts to cloud-based data warehouses (e.g., Snowflake, Azure Synapse, GCP BigQuery)

Document data flows, logic, and transformation rules

Troubleshoot performance and quality issues in batch and real-time pipelines

Support compliance-related reporting (e.g., HMDA, CFPB)

Required Qualifications

6+ years of experience in data engineering or data development

Advanced expertise in SQL (joins, CTEs, optimization, partitioning, etc.)

Strong hands-on skills in Python for scripting, data wrangling, and automation

Proficiency in PySpark for building distributed data pipelines and processing large volumes of structured/unstructured data

Experience with mortgage banking data sets and related domain knowledge are highly preferred

Strong understanding of data modeling (dimensional, normalized, star schema)

Experience with cloud-based platforms (e.g., Azure Databricks, AWS EMR, GCP Dataproc)

Familiarity with ETL tools and orchestration frameworks (e.g., Airflow, ADF, dbt)


