ETL Data Engineer Job, Tysons area, Virginia, USA, Software Development

Description:
Hybrid: 3 days onsite / 2 days remote in McLean, VA

Our client seeks an ETL Data Engineer to build and maintain large-scale data pipelines and to design agentic AI systems that support regulatory analytics. The role spans Spark-based ETL on AWS data lake platforms and development of LLM-powered agents with secure, auditable outputs. The engineer will collaborate across teams, uphold secure development practices, and contribute to CI/CD and infrastructure-as-code while monitoring data quality and production performance.

We can work with W2 and corp-to-corp consultants. For our W2 consultants, we offer a benefits package that includes medical, dental, and vision coverage, a 401(k) with company matching, and life insurance.

Rate: $70.00 to $88.00/hr. W2

Responsibilities:

Build and maintain ETL/ELT pipelines using Apache Spark, Hive, and Trino across S3-based data lakes.
Develop and optimize SQL for large-scale surveillance datasets using window functions, joins, and complex aggregations.
Engineer big data systems on EMR-on-EC2 and EMR-on-EKS and deliver solutions on analytical platforms such as SageMaker, Domino, or Dataiku.
Participate in data quality monitoring, anomaly detection, and production incident investigation.
Develop AI agent systems using AWS Bedrock and agent frameworks such as Strands Agents SDK or LangChain/LangGraph.
Design agent harnesses that combine LLM reasoning with deterministic execution, including skill-based or RAG-based SQL generation and structured output validation.
Implement agent memory, context management, and tool integration, including MCP servers, API connectors, and data catalog lookups.
Build evaluation frameworks for agent accuracy covering paraphrase robustness, routing precision, and structural consistency.
Stay informed on advances in LLM frameworks and emerging AI capabilities.
Write clean, well-tested code and contribute to CI/CD pipelines and infrastructure-as-code on AWS.
Ensure secure handling of sensitive regulatory data with auditable execution traces.
Adhere to secure development practices and technology policies.
Partner across teams, communicate at the appropriate technical level, and maintain documentation on Confluence or a wiki.
Learn from senior team members and contribute to process improvement.

Experience Requirements:

Experience building data pipelines with Apache Spark (PySpark preferred) and SQL.
Experience with SQL engines such as Hive or Trino and cloud data platforms including AWS S3, EMR, and Lambda.
Understanding of data skew, large-volume processing, and troubleshooting job failures due to resources, data quality, and scalability.
Hands-on debugging and mitigation experience.
Practical experience building LLM-powered agent systems that use tools and produce structured outputs.
Experience with agent frameworks such as LangChain, LangGraph, or AWS Strands.
Knowledge of prompt engineering, RAG architectures, and context or memory management.
Experience with foundation model APIs such as Anthropic Claude, Amazon Nova, or OpenAI.
Understanding of agent memory tiers and strategies for persistence, pruning, and retrieval.
Familiarity with harness patterns including deterministic guardrails, tool routing, and verification loops.
Hands-on experience with AI development tools such as GitHub Copilot, Q Developer, ChatGPT, or Claude.
Experience with spec-driven development for AI-assisted code generation and validation.
Ability to leverage AI pair programming for suggestions, debugging, refactoring, and automated test generation.
Experience with AWS services including S3, EMR, EMR on EKS, Lambda, Bedrock, and Step Functions.
Hands-on experience using S3 with Spark and related file format or consistency considerations.
Familiarity with AWS Bedrock guardrails, knowledge bases, and agent orchestration.
Exposure to Google Cloud Vertex AI or equivalent managed AI platforms.
Familiarity with AWS monitoring and logging tools such as CloudWatch and CloudTrail.
Proficiency in Python, including writing clean, modular, and performant code, and an understanding of functional programming concepts.
Strong understanding of collections, concurrency, and memory management.
Proficiency with SQL window functions, joins, aggregations, and complex query optimization including edge cases.

Education Requirements:

Bachelor's degree in Computer Science, Data Science, Information Systems, or related discipline with at least two years of related experience, or equivalent training and work experience. Financial services experience preferred.
Demonstrated expertise in object-oriented and database technologies resulting in enterprise-quality solutions.
Knowledge of software engineering approaches including test automation, build automation, and configuration management.
Strong written and verbal technical communication skills and effective cross-team collaboration.
Ability to learn new skills rapidly and operate in a fast-paced environment.

Recruitment Transparency Notice

Eliassen Group values transparency in our recruitment practices. Please be…