Data Engineer Job, Scranton area, Pennsylvania, USA, IT/Tech

Position: Staff Data Engineer

We're building a world of health around every individual - shaping a more connected, convenient and compassionate health experience. At CVS Health®, you'll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold themselves accountable and prioritize safety and quality in everything they do. Join us and be part of something bigger - helping to simplify health care one person, one family and one community at a time.

Position Summary

If you're eager to make a real impact in the healthcare industry through your own meaningful contributions, join us as we pave the way for technical innovation. At CVS Health, we possess an extensive repository of healthcare data spanning over 150 million individuals, providing an unparalleled foundation for ambitious engineers.

In this high-impact, high-autonomy role, you will be a technical innovator and visionary, leading the evolution of our data infrastructure. You will take a lead role in the end-to-end development of critical data self-service platforms designed to modernize how petabyte-scale data is ingested, accessed, and managed. Your work will be instrumental in shifting from traditional, ticket-driven data handling toward a Data Mesh approach, empowering data owners to take full accountability for their data quality through the robust internal tools you build.

As a Staff Data Engineer, you will:

Architect Petabyte Pipelines:
Engineer scalable, reliable, and performant data pipelines to assemble large and intricate datasets using SQL, DBT, and Snowflake, ensuring high data availability and integrity.
Build Data Platforms:
Independently design and maintain internal React (TypeScript) interfaces and Python backend services that automate data ingestion and discovery, reducing lead times for application teams from weeks to minutes.
Develop Data APIs:
Build and maintain production-grade REST and gRPC APIs that serve as the high-performance interface between our Snowflake data layer and downstream consumer touchpoints.
Modernize Data Operations:
Implement a GitOps model for data using GitHub Actions and Argo/Kargo, integrating standardized logging, alerting, and automated observability into the heart of all data products.
Innovate with AI:
Leverage Cursor AI, MCPs, and other AI tooling to accelerate the data engineering SDLC, from optimizing complex SQL queries to automating schema migrations.
Collaborate and Lead:
Communicate with business leaders to translate complex data requirements into functional specifications while mentoring other engineers in modern data architecture and software best practices.

Key Responsibilities

Data Architecture:
Design and optimize high-volume ETL/ELT pipelines using SQL, DBT, and Snowflake, ensuring data is modeled for both analytical and operational use cases.
Internal Tooling (Full Stack):
Develop and maintain internal-facing web applications using React that allow data owners to interact with, monitor, and configure their data pipelines.
API Development:
Architect and implement REST and gRPC APIs in Python that serve as the interface between our Snowflake data layer and downstream consumer applications.
CI/CD & GitOps:
Own the deployment lifecycle of data services and tools using GitHub Actions for CI and Argo/Kargo for continuous delivery and lifecycle management.
Self-Service Platforms:
Build "Data-as-a-Service" features, such as automated UI-driven ingestion workflows, reducing the reliance on manual data engineering tickets.
AI Integration:
Utilize modern AI development tools (e.g., Claude AI) to accelerate the development of both data pipelines and management interfaces.

Required Qualifications

7+ years of experience in Data Engineering with a heavy focus on Python as the primary scripting and backend language.
7+ years of experience with SQL and cloud data warehouses (e.g., Snowflake, AWS, GCP).
7+ years of experience building high-volume ETL/ELT pipelines and data modeling.

Preferred Qualifications

5+ years of experience with DBT (Data Build Tool).
5+ years of experience building frontend applications with React and designing RESTful APIs.
5+ years of experience with GitHub Actions and GitOps-based deployment tools (e.g., Argo or Kargo).
Big Data Architecture:
High-level understanding of big data design patterns, including Data Lake, Data Mesh, and Iceberg, along with data normalization strategies.
GitOps & Deployment:
Demonstrated experience with Argo/Kargo for Kubernetes-based deployments and advanced GitHub Actions for workflow automation.
Messaging & Streaming:
Experience with message queuing technologies such as Kafka, SNS, or RabbitMQ to support real-time data movement.
AI-Enhanced Development:
Proficiency in working with Cursor AI, GitHub Copilot, or similar AI-driven environments to accelerate engineering cycles.
Observability:
Strong experience with metrics, logging, monitoring, and alerting tools to ensure production system reliability.
Software Fundamentals:
Strong grasp of data structures, algorithms, async…