Apache Spark Developer
Listed on 2026-05-08
IT/Tech
Data Engineer
Absolute Business Solutions Corp (ABSC) is not just another tech company. We’re a community of innovators, engineers, analysts and business professionals working together with our customers to tackle the most complex challenges. For more than 20 years we’ve supported critical DoD, IC and Federal Civilian missions and global, multi-national corporations. We specialize in supporting our clients in the Intelligence, Technology, Defense, AI/ML, and Data Science fields.
Apache Spark Developer
ABSC is seeking a TS/SCI‑cleared developer to support NGA’s Data Modernization Services (DMS) mission by building and optimizing large‑scale data processing pipelines within a containerized, Kubernetes‑based environment. The role focuses on developing high‑performance Spark applications and supporting mission analytics, data exploitation, and AI/ML integration. The ideal candidate thrives in distributed data environments, has deep performance‑tuning expertise, and can operate effectively in secure, air‑gapped systems.
Location choices: on‑site/flexible hours in Herndon, VA; Springfield, VA; St. Louis, MO; or Aurora, CO.
TS/SCI eligibility with willingness/ability to obtain a CI polygraph.
Core Technology Stack
- Apache Spark (PySpark, Scala), Delta Lake, Parquet, Structured Streaming
- Kubernetes, Docker, Git, Jenkins (CI/CD)
- S3/other object storage, AWS, GCP, Azure (environment dependent)
- Python, Scala (preferred), Bash/scripting
Responsibilities
- Design, develop, and maintain Apache Spark pipelines (batch and streaming) using PySpark and/or Scala
- Process and transform large‑scale datasets using modern data lake architectures (Delta Lake, Parquet)
- Optimize Spark jobs for performance, including:
  - Partitioning strategies
  - Shuffle optimization
  - Memory tuning
  - File sizing and storage efficiency
- Implement Structured Streaming pipelines for near real‑time processing
- Develop and deploy Spark applications within containerized environments (Docker) and execute workloads in Kubernetes clusters, supporting scalable and distributed processing
- Integrate Spark pipelines with downstream systems such as analytics platforms (SQL, notebooks) and AI/ML workflows
- Support data ingestion and storage in object‑based systems (e.g., S3-compatible storage)
- Troubleshoot data pipeline failures and ensure reliability in mission‑critical environments
- Operate within secure, air‑gapped environments, managing dependencies without internet access and working within controlled network and security constraints
Required Qualifications
- TS/SCI eligibility with ability/willingness to obtain and maintain a counterintelligence polygraph
- Bachelor’s degree plus 5 years’ experience in data engineering or Spark development (additional years’ experience may substitute for the degree)
- Strong hands‑on experience with Apache Spark, Python (PySpark), and data processing at scale
- Experience with Parquet and/or Delta Lake, distributed data systems, and Docker/containerization
- Basic to intermediate experience with Kubernetes and object storage systems (e.g., S3)
- Strong troubleshooting and performance‑tuning skills
- Proficiency in Bash or scripting
Preferred Qualifications
- Experience with Scala for Spark development
- Experience with Structured Streaming in production environments
- Familiarity with Iceberg or lakehouse architectures
- Experience with CI/CD pipelines (Jenkins, Git)
- Exposure to Terraform or Infrastructure as Code
- Experience supporting AI/ML data pipelines
- Prior experience supporting NGA, IC, or DoD programs
Benefits
- Generous PTO plus 11 Federal Holidays
- 401(k) retirement plan, fully vested, with company match
- Tuition assistance program (annual contributions)
- Annual health and wellness allowance for fitness equipment or services
- Career development fund for education and training
- Volunteer time off: 8 hours annually for charity work
- Charitable match program
- Referral program compensation
- LOV Awards program with yearly bonus awards
Equal Opportunity Employer, including veterans and individuals with disabilities.