Founding AI/ML Engineering Lead; LLM Specialist
Listed on 2026-02-16
IT/Tech
AI Engineer, Data Scientist, Machine Learning/ ML Engineer
About Mongabay
Mongabay is a leading environmental news platform that reaches over 70 million people annually with trusted journalism about conservation, climate change, and environmental issues. Founded 25 years ago, we operate a global network of correspondents across 80+ countries delivering critical information to decision-makers worldwide.
About Story Transformer
Story Transformer is Mongabay’s groundbreaking initiative to democratize access to environmental information. Using generative AI, we’re building a system that automatically transforms our environmental journalism into multiple languages and accessible formats for vulnerable communities in the Global South. This is a rare opportunity to work on AI for social impact at scale, potentially reaching 18+ million people in frontline communities.
Human-AI Partnership: Story Transformer uses AI for speed and scale, but maintains human editorial oversight at critical points. The dual-model verification system is designed to make editors more efficient by automatically flagging potential issues, not to replace their expertise or judgment.
We’re seeking an experienced AI/ML Engineering Lead with deep expertise in Large Language Models to design and implement the foundational architecture and production-ready MVP for Story Transformer’s AI core. You’ll build the initial dual-model verification system, establish prompt engineering frameworks for environmental content, and create evaluation pipelines to ensure outputs are accurate and trustworthy. This is a lead individual contributor (IC) role that requires both technical sophistication and a pragmatic approach to deploying AI in real-world, high-stakes contexts, with a focus on delivering a working system in Phase 1 (6 months) that can scale in future phases.
What You’ll Build
- Dual-Model Verification System: Design and implement a Writer/Reviewer pipeline where two LLMs cross-check translations for accuracy, semantic consistency, and completeness across the initial 5 languages (English, Spanish, Indonesian, Portuguese, French). This system is designed to surface discrepancies and risk signals for human review, not to replace editorial judgment; the goal is to make human editors more efficient by flagging potential issues automatically.
- Domain-Specific Model Optimization: Fine-tune LLMs on Mongabay’s 60,000+ article archive to improve performance on environmental terminology and scientific concepts, establishing the methodology that can scale in Phase 2. For low-resource expansion languages, focus on establishing baseline performance metrics and progressive improvement strategies rather than immediate optimization.
- Non-Global Language Enhancement: Document requirements and develop scalable methods to progressively improve baseline performance for underserved languages like Arabic, Bengali, Malay, Malayalam, Marathi, Nepali, Swahili, Tagalog (Filipino), Tamil, and Vietnamese. This includes approaches for building contextual glossaries and identifying linguistic partnerships needed for Phase 2 expansion.
- Prompt Engineering Framework: Create comprehensive prompt libraries optimized for content transformation across languages, formats, and audience types.
- Evaluation & Quality Systems: Build automated evaluation frameworks to measure translation accuracy, cultural appropriateness, and semantic preservation.
- Multi-Modal Integration: Implement text-to-speech systems optimized for multiple languages, dialects, and low-bandwidth environments.
- Continuous Improvement Pipeline: Design feedback loops that capture user data and model performance to refine outputs over time.
- Primary LLMs: AWS Bedrock (Claude, Llama, Titan, etc. – with flexibility to use multiple models)
- Framework: Python, PyTorch or TensorFlow
- Platform: Amazon Web Services (Bedrock, SageMaker)
- Tools: LangChain, Hugging Face, custom evaluation frameworks
- Data: 60,000+ articles in 6 languages, multilingual terminology databases
- 5+ years of ML/AI experience with at least 2 years focused on Large Language Models. This is a lead individual contributor role with architectural authority – prior…