Software Developer, Backend – Integration
Location: Huntsville, Madison County, Alabama, 35824, USA
Listed on: 2025-12-22
Listing for: Dovel Technologies, Inc (Full Time)
Job specializations:
- IT/Tech: Cybersecurity, AI Engineer
Job Description
Job Family: Data Science & Analysis
Travel Required: Up to 10%
Clearance Required: Active Top Secret (TS)
Guidehouse is seeking a Software Developer to join our Technology / AI and Data team, supporting mission-critical initiatives for Defense and Security clients. In this role, you will lead the design and implementation of secure, scalable ingestion and data processing workflows that power advanced AI-driven platforms. You will architect solutions for transforming complex, high-volume data into structured outputs optimized for downstream AI/ML pipelines, while ensuring compliance with stringent federal security and regulatory standards.
Collaborating with engineers, architects, and mission stakeholders, you will deliver innovative backend capabilities that enable accurate, efficient, and reliable decision-making in support of national security objectives.
What You Will Do:
* Serves as the lead backend integration engineer responsible for architecting and implementing ingestion, preprocessing, normalization, and transformation workflows for the FBI adjudication AI platform.
* Designs ingestion frameworks supporting SF-86 forms, investigative attachments, summaries, financial/criminal records, and continuous vetting alerts using both traditional OCR and VLM/LLM-based document understanding.
* Ensures ingestion workflows comply with FedRAMP High, RMF, CJIS, and FBI ATO requirements, including logging, auditability, encryption, and secure processing of PII and sensitive investigative information.
* Collaborates with AI/ML engineers, backend API developers, cloud engineers, and security engineers to ensure ingestion outputs are optimized for RAG workflows, SEAD-4 scoring, anomaly detection, and adjudicator review.
Data Ingestion, Parsing & ETL Architecture:
* Design ingestion pipelines supporting LLMs and VLMs for OCR, document understanding, multimodal extraction, and parsing of complex investigative materials including forms, tables, handwritten elements, and embedded imagery.
* Build scalable ingestion and ETL workflows capable of processing hundreds of pages per case using OCR engines (Textract, Tesseract) and VLM-based parsing models such as LayoutLM, Qwen-VL, Donut, or LLaVA.
* Implement normalization and transformation workflows including deduplication, schema harmonization, field mapping, classification labeling, chunking, segmentation, and tokenization optimized for downstream LLM/RAG operations.
* Develop fault-tolerant ingestion systems with checkpointing, idempotency, retry frameworks, ingestion-state tracking, and structured error reporting.
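To illustrate what checkpointing, idempotency, and bounded retries look like in an ingestion step, here is a minimal sketch. It is not the platform's actual implementation; the `ingest` function, the content-hash idempotency key, and the in-memory `checkpoint` dict are all assumptions made for the example.

```python
import hashlib
import time


class IngestError(Exception):
    """Raised when a document cannot be parsed."""


def doc_key(raw_bytes: bytes) -> str:
    # Content hash as an idempotency key: the same document always
    # maps to the same key, so re-ingestion after a restart is a no-op.
    return hashlib.sha256(raw_bytes).hexdigest()


def ingest(doc: bytes, parse, checkpoint: dict, max_retries: int = 3):
    """Parse one document with checkpointing and bounded retries.

    `checkpoint` maps idempotency keys to parsed output; a crash and
    restart simply skips documents already recorded there.
    """
    key = doc_key(doc)
    if key in checkpoint:  # already ingested: idempotent skip
        return checkpoint[key]
    last_err = None
    for attempt in range(max_retries):
        try:
            result = parse(doc)
            checkpoint[key] = result  # commit state only after success
            return result
        except IngestError as err:
            last_err = err
            time.sleep(0)  # backoff placeholder (zero for the sketch)
    # Structured error report instead of silently dropping the document.
    raise IngestError(
        f"doc {key[:12]} failed after {max_retries} attempts: {last_err}"
    )
```

In a real pipeline the checkpoint store would be a durable database rather than a dict, and the backoff would be exponential with jitter, but the control flow — key, skip, retry, commit-on-success — is the same shape.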
Backend Integration & System Connectivity:
* Build secure, compliant integrations with FBI systems, case repositories, identity/HR systems, and continuous vetting alert sources using APIs, ETL endpoints, SFTP, and message queues.
* Develop backend microservices that assemble case packages, correlate evidence across disparate sources, and produce structured adjudication-ready datasets.
* Integrate ingestion outputs with vector databases, embedding pipelines, and LLM inference services, ensuring data is structured, enriched, and optimized for reasoning workflows.
* Ensure all integrations enforce strict authentication, authorization, validation, and data-handling policies.
RAG / LLM Data Preparation:
* Create ingestion workflows that prepare documents and extracted content for embeddings, retrieval indexing, semantic search, and long-context reasoning.
* Implement chunking, segmentation, labeling, and evidence-tagging strategies designed to maximize retrieval precision and reduce hallucination risk in LLM inference.
* Develop heuristics for filtering, prioritizing, and contextualizing extracted information to enable fact-grounded SEAD-4 scoring and memo generation.
* Support preparation of vector representations, metadata fields, and retrieval keys for large-scale evidence collections.
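As a sketch of the chunking-with-provenance idea above: each chunk carries the document ID, chunk index, and character offsets so a retriever can cite the exact evidence span behind an answer. The `chunk_for_rag` function and its parameters are illustrative assumptions, not part of any named system.

```python
def chunk_for_rag(doc_id: str, text: str, max_chars: int = 400,
                  overlap: int = 50):
    """Split extracted text into overlapping chunks, each tagged with
    provenance metadata (doc id, chunk index, character span)."""
    chunks = []
    start = 0
    idx = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append({
            "chunk_id": f"{doc_id}#{idx}",   # retrieval key
            "doc_id": doc_id,                # provenance: source document
            "span": (start, end),            # provenance: exact offsets
            "text": text[start:end],
        })
        if end == len(text):
            break
        start = end - overlap  # overlap preserves cross-boundary context
        idx += 1
    return chunks
```

The overlap means a sentence cut at a chunk boundary still appears intact in the neighboring chunk, which is one common way to keep retrieval precision up without re-parsing at query time.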
Security, Compliance & Logging:
* Implement secure ingestion pipelines aligned with FedRAMP High, RMF, CJIS, and FBI security requirements including encryption, access control, PII-handling rules, and secure logging.
* Apply advanced PII-safe processing techniques including automated redaction, VLM-aided sensitive field detection, classification tagging, and compliance-driven filtering.
* Ensure ingestion systems generate detailed logs, lineage metadata, provenance trails, and audit events supporting adjudication oversight and accreditation documentation.
* Collaborate with Security Engineers to ensure ingestion controls map to SSP requirements and POA&M items are remediated promptly.
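A toy sketch of automated redaction with an audit trail, in the spirit of the PII-handling bullets above: pattern-based matching with typed placeholders and a log of what was removed and where. The regex patterns here are deliberately simplistic stand-ins; a production pipeline would pair them with model-based sensitive-field detection as described.

```python
import re

# Toy patterns only: US SSNs and dash-formatted phone numbers.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def redact(text: str):
    """Replace matched PII with typed placeholders; return the redacted
    text plus an audit log of match types and spans for oversight."""
    audit = []
    for label, pattern in PII_PATTERNS.items():
        def sub(m, label=label):
            audit.append({"type": label, "span": m.span()})
            return f"[{label} REDACTED]"
        text = pattern.sub(sub, text)
    return text, audit
```

Keeping the audit log separate from the redacted text is what lets the same pass feed both the downstream AI pipeline (clean text) and accreditation documentation (what was removed, where, and why).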
Performance Optimization & Reliability:
* Optimize ingestion pipelines for parallelization, concurrency, batching, memory efficiency, and large-scale document processing throughput.
* Implement distributed ETL frameworks such as Step Functions, Airflow, Dagster, Glue, or Spark depending on workload and security constraints.
* Develop monitoring dashboards capturing ingestion throughput, VLM/LLM OCR…