Apache Spark Developer Security Clearance Job, Herndon area, Virginia, USA, IT/Tech

Position: Apache Spark Developer with Security Clearance
Office: Virginia

Location: Herndon, VA

Absolute Business Solutions Corp (ABSC) is not just another tech company. We’re a community of innovators, engineers, analysts and business professionals working together with our customers to tackle the most complex challenges. For more than 20 years we’ve supported critical DoD, IC and Federal Civilian missions and global, multinational corporations. We specialize in supporting our clients in the Intelligence, Technology, Defense, AI/ML, and Data Science fields.

As we continue to grow at a rapid pace, we are seeking some amazing new professionals to join our team. We are actively hiring a TS/SCI-cleared Apache Spark Developer to support NGA’s Data Modernization Services (DMS) mission by building and optimizing large-scale data processing pipelines. This role focuses on developing high-performance Spark applications within a containerized, Kubernetes-based environment, supporting mission analytics, data exploitation, and AI/ML integration.

The ideal candidate thrives in distributed data environments, understands performance tuning deeply, and can operate effectively in secure, air-gapped systems. This role is on-site/flexible hours in Herndon, VA; Springfield, VA; St. Louis, MO; or Aurora, CO.

Clearance Required for this role: TS/SCI eligibility with willingness/ability to obtain CI polygraph.

Core Technology Stack

Data / Processing
* Apache Spark (PySpark, Scala)

* Delta Lake, Parquet

* Structured Streaming
Infrastructure
* Kubernetes (execution environment)

* Docker
Storage / Cloud (Abstracted)
* S3 / object storage

* AWS / GCP / Azure (environment-dependent)
DevOps (Exposure Level)
* Git, Jenkins (CI/CD)
Languages
* Python (PySpark)

* Scala (preferred)

* Bash / scripting

Key Responsibilities
* Design, develop, and maintain Apache Spark pipelines (batch and streaming) using PySpark and/or Scala

* Process and transform large-scale datasets using modern data lake architectures (Delta Lake, Parquet)

* Optimize Spark jobs for performance, including:
o Partitioning strategies
o Shuffle optimization
o Memory tuning
o File sizing and storage efficiency
* Implement Structured Streaming pipelines for near real-time data processing

* Develop and deploy Spark applications within containerized environments (Docker)

* Execute workloads in Kubernetes clusters, supporting scalable and distributed processing

* Integrate Spark pipelines with downstream systems, including:
o Analytics platforms (SQL, notebooks)
o AI/ML workflows and feature engineering pipelines
* Support data ingestion and storage in object-based systems (e.g., S3-compatible storage)

* Troubleshoot data pipeline failures and ensure reliability in mission-critical environments

* Operate within secure, air-gapped environments, including:
o Managing dependencies without internet access
o Working within controlled network and security constraints

Required Qualifications:

* TS/SCI (eligibility) with ability/willingness to obtain/maintain counterintelligence polygraph

* Bachelor’s degree plus 5 years’ experience in data engineering or Spark development (additional years of experience may be considered in lieu of a degree)

* Strong hands-on experience with:
o Apache Spark (mandatory)
o Python (PySpark)
o Data processing at scale
* Experience working with:
o Parquet and/or Delta Lake
o Distributed data systems
* Familiarity with:
o Docker / containerization
o Kubernetes (basic to intermediate experience)
* Experience with object storage systems (e.g., S3 or equivalent)

* Strong troubleshooting and performance tuning skills

* Proficiency in Bash or scripting

Preferred Qualifications:

* Experience with Scala for Spark development

* Experience with Structured Streaming in production environments

* Familiarity with Iceberg or lakehouse architectures

* Experience with CI/CD pipelines (Jenkins, Git)

* Exposure to Terraform or Infrastructure as Code

* Experience supporting AI/ML data pipelines

* Prior…