LLM Training & Model Development Engineer
Job in Fort Lauderdale, Broward County, Florida, 33301, USA
Listed on 2026-06-25
Listing for: InOpTra Digital
Apprenticeship/Internship position
Job specializations:
- IT/Tech: Machine Learning/ML Engineer, AI Engineer (Applied/Software), Data Engineering, Data Scientist
Job Description & How to Apply Below
Overview
Strong Data Engineer with agentic AI experience, capable of data extraction, transformation, feature engineering, and analytics to build AI/ML models. Seeking candidates local to the USA.
Responsibilities
- Curate and preprocess training corpora for domain-specific instruction tuning.
- Fine-tune open-source LLMs using LoRA, RLHF, DPO, and model distillation techniques.
- Implement model evaluation pipelines and benchmark reporting.
- Collaborate with Prompt & Data teams to create repeatable model tuning workflows.
- Architect and implement data pipelines for large-scale text ingestion, cleaning, and transformation.
- Perform data extraction, transformation, and feature engineering across structured and unstructured sources.
- Develop and maintain data quality frameworks ensuring clean, diverse, and bias-mitigated datasets for model training.
- Automate data labeling and annotation workflows using LLM-assisted or agentic tools.
- Build domain-specific corpora for instruction tuning, conversational grounding, and retrieval-augmented training.
- Fine-tune and adapt open-source LLMs (e.g., LLaMA, Mistral, Falcon, Gemma) using LoRA, QLoRA, RLHF, DPO, and model distillation.
- Implement self-instruct and multi-turn conversational fine-tuning for agentic use cases.
- Design training orchestration scripts for distributed GPU/TPU environments (PyTorch, DeepSpeed, Hugging Face Accelerate).
- Develop evaluation frameworks for automatic and human-in-the-loop assessment of LLM performance.
- Benchmark models against standard datasets (MMLU, HELM, ARC, TruthfulQA) and custom internal benchmarks.
- Generate detailed performance dashboards tracking precision, hallucination rate, factual consistency, and latency.
- Conduct A/B testing and regression analysis on model updates to ensure stable improvement.
- Work cross-functionally with Prompt Engineers, Data Scientists, and DevOps to operationalize model development.
- Build repeatable pipelines for fine-tuning, version control, and continuous model improvement (MLOps).
- Integrate agentic feedback loops for continuous self-improvement and autonomous retraining cycles.
- Support deployment through containerized model serving (FastAPI, Triton, or Ray Serve).
- Model Training & Fine-Tuning: Fine-tune and adapt open-source LLMs (e.g., LLaMA, Mistral, Falcon, Gemma) using LoRA, QLoRA, RLHF, DPO, and model distillation; implement self-instruct and multi-turn conversational fine-tuning for agentic use cases.
- Model Evaluation & Benchmarking: Develop evaluation frameworks for automatic and human-in-the-loop assessment of LLM performance; benchmark models against standard datasets and internal benchmarks; generate performance dashboards; conduct A/B testing and regression analysis.
- Strong Python expertise with hands-on experience in PyTorch, Hugging Face Transformers, and LangChain.
- Deep understanding of LLM architectures, tokenizer mechanics, and parameter-efficient fine-tuning.
- Proficiency in data processing frameworks (Spark, Airflow, Pandas, Arrow, Dask).
- Experience with distributed training and GPU/TPU optimization (CUDA, NCCL).
- Knowledge of evaluation metrics and human-aligned reward modeling.
- Experience with vector databases (FAISS, Milvus, Pinecone) for context retrieval.
- Familiarity with cloud platforms (AWS, GCP, Azure) and container orchestration (Docker, Kubernetes).
- Exposure to agentic AI frameworks and feedback-based continuous improvement systems is a plus.
- Prior experience contributing to open-source LLM projects.
- Background in NLP research or applied ML.
- Knowledge of data privacy, ethical AI, and prompt alignment techniques.
- Master’s or Ph.D. in Computer Science, AI, or related field preferred.
- A home-grown, domain-specialized LLM trained on proprietary and public datasets.
- A scalable fine-tuning pipeline that powers multiple downstream agents and AI applications.
- An autonomous model training framework capable of learning from feedback in real time.