PySpark Big Data Senior Developer - Vice President

Job in New York, New York County, New York, 10261, USA
Listing for: Citi
Full Time position
Listed on 2026-06-19
Job specializations:
  • Software Development
    Cloud Engineer - Software, Database Engineering, SQL Developer, Senior Developer
Salary/Wage Range or Industry Benchmark: 125,000 - 150,000 USD yearly
Job Description & How to Apply Below
Location: New York

Job Req :

Location(s): Mississauga, Ontario, Canada
Job Type: Hybrid
Posted: Apr. 23, 2026

Job Overview

We are building an A-team of highly skilled and autonomous engineers, and we are seeking an exceptional PySpark Big Data Senior Developer to join our dynamic and focused squads. This role is for a hands‑on player/coach who thrives in a high‑autonomy environment, is deeply committed to leveraging AI for maximum productivity, and possesses a profound understanding of the functional domains our work impacts.

The ideal candidate will be instrumental in designing, developing, and optimizing large‑scale data processing solutions using PySpark and cutting‑edge big data technologies. We are looking for an AI‑first thinker who can raise the bar, coach others, and strategically contribute to our evolving technology landscape.

Responsibilities:

  • Operate end‑to‑end in the design, development, and implementation of robust big data solutions, ensuring optimal performance, scalability, data quality, and security.
  • Collaborate closely within small, co‑located squads (4-7 person teams), fostering high communication and low coordination overhead, to translate complex business requirements into technical specifications for big data processing and analytical solutions.
  • Act as a player/coach within the team, mentoring junior members and leading by example in the development of efficient and innovative big data architectures.
  • Design, develop, and optimize large‑scale data pipelines using PySpark for data ingestion, transformation, and aggregation, always with an eye towards efficiency and domain relevance.
  • Implement and manage real‑time data streaming and event‑driven architectures using technologies like Apache Kafka.
  • Design and implement sophisticated data warehousing solutions and dimensional models for efficient data storage and retrieval, ensuring alignment with business needs.
  • Work with various distributed data storage technologies, including distributed file systems (e.g., HDFS, S3) and NoSQL databases (e.g., MongoDB, Cassandra), selecting the right tool for the right problem.
  • Implement efficient data processing and storage strategies to optimize the performance and scalability of big data applications, with a strong focus on the “why” behind the technology choices.
  • Champion best practices in software development, including rigorous code reviews, implementing comprehensive testing, and supporting continuous integration and continuous deployment (CI/CD) pipelines.
  • Demonstrate high autonomy and agency in driving projects forward, making informed decisions, and proactively identifying areas for improvement.
  • Proactively leverage and contribute to the development of AI‑powered development tools, including internal Citi AI tools like Copilot, Claude Code, Codex, and Antigravity, to significantly enhance productivity and code quality and to accelerate development cycles.
  • Lead technical discussions and contribute strategically to the evolution of our big data technology stack, always seeking innovative approaches.
  • Troubleshoot and resolve complex technical issues within big data environments, demonstrating strong analytical and problem‑solving skills.

Required Skills & Experience:

  • Experience: 6+ years of extensive, hands‑on experience as a Senior Big Data Developer, with a strong emphasis on PySpark and the Apache Spark ecosystem, operating as a player/coach.
  • Programming Languages:
    • Expert proficiency in Python, with a proven track record of developing robust, scalable, and high‑performance PySpark applications for large‑scale data processing.
  • Big Data Frameworks/Technologies:
    • Deep understanding and extensive hands‑on experience with Apache Spark (Spark Core, Spark SQL, Spark Streaming) and its ecosystem.
    • Experience with distributed computing frameworks such as Hadoop (HDFS, YARN).
  • Data Storage & Warehousing:
    • Expert proficiency in SQL and extensive experience with data warehousing concepts and technologies (e.g., Hive, Snowflake, Redshift, Databricks SQL).
    • Proven experience with various data storage formats (e.g., Parquet, ORC, Avro) and data lake solutions (e.g., Delta…
Position Requirements
10+ Years work experience