Data Engineer
Job in Phoenix, Maricopa County, Arizona, 85003, USA
Listed on 2026-06-01
Listing for: JPC TECHNO INC
Full Time position
Job specializations:
- IT/Tech: Data Engineer, Big Data
Job Description & How to Apply Below
Seeking a Data Engineer with strong PySpark experience to work on large‑scale data processing and analytics initiatives. The ideal candidate will have hands‑on experience working with large datasets, complex joins, and performance optimization, along with the ability to apply basic analytical thinking and deliver clear, stakeholder‑ready outputs.
- Design, develop, and maintain scalable data pipelines using PySpark.
- Write efficient and optimized PySpark code to process and transform large‑scale datasets.
- Handle joins across multiple large databases, ensuring performance, accuracy, and scalability.
- Optimize Spark jobs to minimize runtime, memory usage, and compute cost.
- Work with structured and semi‑structured data from multiple sources.
- Build and curate training and analytical datasets by joining and transforming multiple data sources.
- Apply basic analytical skills to understand data patterns, anomalies, and business relevance.
- Perform data validation and quality checks:
  - Record counts and reconciliation
  - Null and outlier checks
  - Schema and data‑type validation
- Ensure datasets are analysis‑ready and trustworthy.
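As an illustration of the validation checks listed above, here is a minimal sketch in plain Python. It is not part of the employer's codebase: the function name `validate_rows`, its parameters, the sample records, and the outlier rule (median absolute deviation with an arbitrary threshold) are all assumptions made for the example. In a PySpark pipeline the same checks would be expressed as DataFrame operations rather than loops over dictionaries.

```python
from statistics import median


def validate_rows(rows, expected_schema, expected_count, numeric_field,
                  mad_threshold=5.0):
    """Run basic quality checks on a list of dict records (hypothetical helper)."""
    report = {}

    # Record count reconciliation against an externally supplied expected count.
    report["count_ok"] = len(rows) == expected_count

    # Null check: number of missing values per expected field.
    report["null_counts"] = {
        field: sum(1 for r in rows if r.get(field) is None)
        for field in expected_schema
    }

    # Schema / data-type validation: non-null values must match the expected type.
    report["type_errors"] = [
        (i, field)
        for i, r in enumerate(rows)
        for field, typ in expected_schema.items()
        if r.get(field) is not None and not isinstance(r[field], typ)
    ]

    # Outlier check (assumed rule): flag values whose distance from the median
    # exceeds mad_threshold times the median absolute deviation.
    values = [r[numeric_field] for r in rows
              if isinstance(r.get(numeric_field), (int, float))]
    outliers = []
    if values:
        med = median(values)
        mad = median(abs(v - med) for v in values)
        if mad > 0:
            outliers = [v for v in values if abs(v - med) / mad > mad_threshold]
    report["outliers"] = outliers
    return report


# Made-up sample records for demonstration only.
sample_rows = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 12.0},
    {"id": 3, "amount": None},
    {"id": 4, "amount": 11.0},
    {"id": 5, "amount": 500.0},
    {"id": 6, "amount": "13"},  # wrong type on purpose
]
sample_schema = {"id": int, "amount": float}

if __name__ == "__main__":
    print(validate_rows(sample_rows, sample_schema, 6, "amount"))
```

Returning a report dictionary instead of raising on the first failure lets every check run on the same pass, which is convenient when summarizing data quality for stakeholders.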
- Understand business objectives and translate them into data requirements.
- Ask the right questions to determine:
  - Level of aggregation required
  - Data freshness and accuracy expectations
  - Preferred output and reporting formats
- Present results and insights clearly to stakeholders.
- Create reports and summaries using Excel for business users and leadership.
Candidates are expected to demonstrate the ability to:
- Approach complex data projects methodically, starting with:
  - Understanding business objectives
  - Reviewing source data structure and volume
  - Designing efficient join strategies
- Choosing the right join types, partitioning strategies, and caching techniques
- Validating data at every stage of the pipeline
- Balancing technical accuracy with business usability when presenting results
- Strong hands‑on experience with PySpark
- Extensive experience working with large datasets
- Proven expertise in joining large databases efficiently
- Ability to write high‑performance, optimized code
- Basic analytical skills to interpret and validate data
- Experience in model development or supporting analytics/modeling teams
- SAS experience
- Exposure to Cloudera or similar big data platforms
- Understanding of data warehousing and analytics workflows
Strong problem‑solving and logical thinking