Software Developer, Backend – Integration
Location: Huntsville, Madison County, Alabama, 35824, USA
Listed on: 2025-12-22
Listing for: Dovel Technologies, Inc (Full Time)
Job specializations:
- IT/Tech: Cybersecurity, AI Engineer
Job Description
Job Family: Data Science & Analysis
Travel Required: Up to 10%
Clearance Required: Active Top Secret (TS)
Guidehouse is seeking a Software Developer to join our Technology / AI and Data team, supporting mission-critical initiatives for Defense and Security clients. In this role, you will lead the design and implementation of secure, scalable ingestion and data processing workflows that power advanced AI-driven platforms. You will architect solutions for transforming complex, high-volume data into structured outputs optimized for downstream AI/ML pipelines, while ensuring compliance with stringent federal security and regulatory standards.
Collaborating with engineers, architects, and mission stakeholders, you will deliver innovative backend capabilities that enable accurate, efficient, and reliable decision-making in support of national security objectives.
What You Will Do:
* Serves as the lead backend integration engineer responsible for architecting and implementing ingestion, preprocessing, normalization, and transformation workflows for the FBI adjudication AI platform.
* Designs ingestion frameworks supporting SF-86 forms, investigative attachments, summaries, financial/criminal records, and continuous vetting alerts using both traditional OCR and VLM/LLM-based document understanding.
* Ensures ingestion workflows comply with FedRAMP High, RMF, CJIS, and FBI ATO requirements, including logging, auditability, encryption, and secure processing of PII and sensitive investigative information.
* Collaborates with AI/ML engineers, backend API developers, cloud engineers, and security engineers to ensure ingestion outputs are optimized for RAG workflows, SEAD-4 scoring, anomaly detection, and adjudicator review.
Data Ingestion, Parsing & ETL Architecture:
* Design ingestion pipelines supporting LLMs and VLMs for OCR, document understanding, multimodal extraction, and parsing of complex investigative materials including forms, tables, handwritten elements, and embedded imagery.
* Build scalable ingestion and ETL workflows capable of processing hundreds of pages per case using OCR engines (Textract, Tesseract) and VLM-based parsing models such as LayoutLM, Qwen-VL, Donut, or LLaVA.
* Implement normalization and transformation workflows including deduplication, schema harmonization, field mapping, classification labeling, chunking, segmentation, and tokenization optimized for downstream LLM/RAG operations.
* Develop fault-tolerant ingestion systems with checkpointing, idempotency, retry frameworks, ingestion-state tracking, and structured error reporting.
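To illustrate what checkpointing, idempotency, and bounded retries look like in an ingestion step, here is a minimal sketch. It is not the platform's actual implementation; the `ingest` function, the content-hash idempotency key, and the in-memory `checkpoint` dict are all assumptions made for the example.

```python
import hashlib
import time


class IngestError(Exception):
    """Raised when a document cannot be parsed."""


def doc_key(raw_bytes: bytes) -> str:
    # Content hash as an idempotency key: the same document always
    # maps to the same key, so re-ingestion after a restart is a no-op.
    return hashlib.sha256(raw_bytes).hexdigest()


def ingest(doc: bytes, parse, checkpoint: dict, max_retries: int = 3):
    """Parse one document with checkpointing and bounded retries.

    `checkpoint` maps idempotency keys to parsed output; a crash and
    restart simply skips documents already recorded there.
    """
    key = doc_key(doc)
    if key in checkpoint:  # already ingested: idempotent skip
        return checkpoint[key]
    last_err = None
    for attempt in range(max_retries):
        try:
            result = parse(doc)
            checkpoint[key] = result  # commit state only after success
            return result
        except IngestError as err:
            last_err = err
            time.sleep(0)  # backoff placeholder (zero for the sketch)
    # Structured error report instead of silently dropping the document.
    raise IngestError(
        f"doc {key[:12]} failed after {max_retries} attempts: {last_err}"
    )
```

In a real pipeline the checkpoint store would be a durable database rather than a dict, and the backoff would be exponential with jitter, but the control flow — key, skip, retry, commit-on-success — is the same shape.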
Backend Integration & System Connectivity:
* Build secure, compliant integrations with FBI systems, case repositories, identity/HR systems, and continuous vetting alert sources using APIs, ETL endpoints, SFTP, and message queues.
* Develop backend microservices that assemble case packages, correlate evidence across disparate sources, and produce structured adjudication-ready datasets.
* Integrate ingestion outputs with vector databases, embedding pipelines, and LLM inference services, ensuring data is structured, enriched, and optimized for reasoning workflows.
* Ensure all integrations enforce strict authentication, authorization, validation, and data-handling policies.
RAG / LLM Data Preparation:
* Create ingestion workflows that prepare documents and extracted content for embeddings, retrieval indexing, semantic search, and long-context reasoning.
* Implement chunking, segmentation, labeling, and evidence-tagging strategies designed to maximize retrieval precision and reduce hallucination risk in LLM inference.
* Develop heuristics for filtering, prioritizing, and contextualizing extracted information to enable fact-grounded SEAD-4 scoring and memo generation.
* Support preparation of vector representations, metadata fields, and retrieval keys for large-scale evidence collections.
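As a sketch of the chunking-with-provenance idea above: each chunk carries the document ID, chunk index, and character offsets so a retriever can cite the exact evidence span behind an answer. The `chunk_for_rag` function and its parameters are illustrative assumptions, not part of any named system.

```python
def chunk_for_rag(doc_id: str, text: str, max_chars: int = 400,
                  overlap: int = 50):
    """Split extracted text into overlapping chunks, each tagged with
    provenance metadata (doc id, chunk index, character span)."""
    chunks = []
    start = 0
    idx = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append({
            "chunk_id": f"{doc_id}#{idx}",   # retrieval key
            "doc_id": doc_id,                # provenance: source document
            "span": (start, end),            # provenance: exact offsets
            "text": text[start:end],
        })
        if end == len(text):
            break
        start = end - overlap  # overlap preserves cross-boundary context
        idx += 1
    return chunks
```

The overlap means a sentence cut at a chunk boundary still appears intact in the neighboring chunk, which is one common way to keep retrieval precision up without re-parsing at query time.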
Security, Compliance & Logging:
* Implement secure ingestion pipelines aligned with FedRAMP High, RMF, CJIS, and FBI security requirements including encryption, access control, PII-handling rules, and secure logging.
* Apply advanced PII-safe processing techniques including automated redaction, VLM-aided sensitive field detection, classification tagging, and compliance-driven filtering.
* Ensure ingestion systems generate detailed logs, lineage metadata, provenance trails, and audit events supporting adjudication oversight and accreditation documentation.
* Collaborate with Security Engineers to ensure ingestion controls map to SSP requirements and POA&M items are remediated promptly.
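A toy sketch of automated redaction with an audit trail, in the spirit of the PII-handling bullets above: pattern-based matching with typed placeholders and a log of what was removed and where. The regex patterns here are deliberately simplistic stand-ins; a production pipeline would pair them with model-based sensitive-field detection as described.

```python
import re

# Toy patterns only: US SSNs and dash-formatted phone numbers.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def redact(text: str):
    """Replace matched PII with typed placeholders; return the redacted
    text plus an audit log of match types and spans for oversight."""
    audit = []
    for label, pattern in PII_PATTERNS.items():
        def sub(m, label=label):
            audit.append({"type": label, "span": m.span()})
            return f"[{label} REDACTED]"
        text = pattern.sub(sub, text)
    return text, audit
```

Keeping the audit log separate from the redacted text is what lets the same pass feed both the downstream AI pipeline (clean text) and accreditation documentation (what was removed, where, and why).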
Performance Optimization & Reliability:
* Optimize ingestion pipelines for parallelization, concurrency, batching, memory efficiency, and large-scale document processing throughput.
* Implement distributed ETL frameworks such as Step Functions, Airflow, Dagster, Glue, or Spark depending on workload and security constraints.
* Develop monitoring dashboards capturing ingestion throughput, VLM/LLM OCR…