AI/Machine Learning Engineer - Vision Language Models/Multimodal AI; NGA
Listed on 2026-06-05
Title: AI/Machine Learning Engineer – Vision Language Models / Multimodal AI (NGA)
Location: Springfield or Herndon, VA (onsite)
Clearance: TS/SCI (CI Poly preferred)
Position Type: Full-Time, Direct Hire
Pay: $175,000 to $250,000 for an SME
Company: The name of our partner organization will be disclosed during the interview process. This is not a direct role with Launch Code; it is a position through Launch Code, working with one of our partner companies.
Disclaimer: We are unable to provide work sponsorship for this role.
Overview:
We’re hiring an AI/Machine Learning Engineer with strong experience in multimodal AI and large-scale model training to support advanced vision-language initiatives in a secure government environment. This role will focus on fine-tuning Vision Language Models (VLMs) on domain-specific geospatial imagery, building scalable AWS training infrastructure, and developing evaluation frameworks for image understanding and spatial reasoning. Ideal candidates will have deep experience with PyTorch, Hugging Face, distributed training, and computer vision, along with the ability to optimize and deploy multimodal models in mission-critical environments.
Hands-on experience fine-tuning multimodal models such as CLIP, LLaVA, Qwen-VL, or similar Vision Language Models on classified or mission-specific imagery datasets is a huge plus. The ideal candidate can build the AWS infrastructure needed to train and scale these models, evaluate performance improvements across real-world use cases, and deploy solutions into secure government or air-gapped environments.
Key Responsibilities:
- Design and execute fine-tuning pipelines for Vision Language Models (VLMs) using domain-specific imagery datasets
- Handle data preprocessing, training orchestration, and hyperparameter optimization for multimodal models
- Build evaluation frameworks for image understanding, visual question answering, and spatial reasoning tasks
- Develop scalable AWS-based ML infrastructure using SageMaker and GPU-enabled EC2 for distributed training
- Create data pipelines for curating, annotating, and transforming geospatial imagery into model-ready datasets
- Partner with applied scientists and architects on model architecture improvements, LoRA/QLoRA strategies, and inference optimization
Required Qualifications:
- Active TS/SCI with CI Poly
- 5+ years of machine learning engineering experience focused on deep learning
- 1+ year of hands-on experience fine-tuning foundation models (LLMs or VLMs)
- Experience with LoRA, QLoRA, adapters, supervised fine-tuning, instruction tuning, and RLHF/DPO
- 4+ years of advanced Python development for ML workloads
- Strong PyTorch and Hugging Face experience (Transformers, PEFT, Datasets, Accelerate)
- Experience with distributed training frameworks such as DeepSpeed, FSDP, or Megatron
- 3+ years working with computer vision or multimodal models
- Familiarity with vision transformer architectures (ViT, CLIP, LLaVA, etc.)
- Experience processing and augmenting image datasets at scale
- 3+ years with AWS ML infrastructure including SageMaker, EC2 GPU environments, and S3
- Experience with ML evaluation pipelines, benchmarking, metrics, and result analysis
- Strong software engineering fundamentals including version control, testing, and CI/CD
Preferred Qualifications:
- 2+ years working with geospatial or remote sensing imagery
- Experience with EO or SAR satellite imagery
- Understanding of geospatial metadata, coordinate systems, and imagery preprocessing
- Experience with model quantization / inference optimization (vLLM, TensorRT, ONNX)
- MLOps tooling experience (MLflow, Weights & Biases, SageMaker Experiments)
- Familiarity with annotation tools and active learning workflows
- Containerized ML experience with Docker / ECR / ECS / EKS
- Experience supporting ATO processes and NIST 800-53 compliance
- Experience deploying in air-gapped/disconnected environments
- Familiarity with multimodal evaluation benchmarks (MMMU, MMBench, GQA)
- Publications or contributions in computer vision, multimodal AI, or VLMs
- Synthetic data generation experience for training augmentation