Remote AI Research Engineer — Agentic Post-Training
Town of Italy, Penn Yan, Yates County, New York, 14527, USA
Listed on 2026-05-27
Software Development
AI Engineer, Data Scientist, Machine Learning/ML Engineer
About The Job
As a member of the AI model team, you will drive innovation in post‑training methodologies, with a special focus on agentic behaviors and tool use. Your work will refine pre‑trained models so that they not only deliver enhanced intelligence and domain‑specific capabilities, but also learn to reason, plan, and autonomously invoke external tools to solve real‑world, multi‑step tasks and applications on edge devices (i.e., smartphones).
You will work on a wide spectrum of systems, ranging from streamlined, resource‑efficient agents that run on limited hardware to complex multi‑modal architectures integrating text, images, and audio, all optimized for tool‑augmented decision making.
We expect you to have deep expertise in large language model architectures and substantial experience in post‑training for agentic workflows, including tool‑use fine‑tuning, function calling, and reinforcement learning from feedback on multi‑turn interactions. You will adopt a hands‑on, research‑driven approach to developing, testing, and implementing new post‑training algorithms that unlock goal‑directed behavior, self‑correction, and reliable tool invocation.
Responsibilities
- Conduct end-to-end research and engineering initiatives to advance post‑training of agentic and tool‑use models to achieve SOTA results.
- Drive broad, cross‑cutting model improvements, including factuality, instruction adherence, tool/function use, multi‑agent coordination, and reasoning calibration.
- Design and enhance large-scale post‑training systems, including data pipelines, training workflows, evaluation frameworks, and benchmark infrastructure.
- Develop rigorous evaluation suites and diagnostic tools to assess model readiness for deployment.
- Strengthen feedback loops from real‑world product usage, incorporating both explicit and implicit user signals into post‑training.
- Collaborate with tooling, product, and training teams to improve the usefulness, reliability, and agentic capabilities of frontier models.
- Liaise closely with research, engineering, and cross‑functional teams to determine which integrations are production‑ready for inclusion in major model releases.
Qualifications
- Degree in Computer Science, Machine Learning, or a related field; advanced degree (MS/PhD) preferred, with a strong publication record in top‑tier AI conferences.
- Experience with multimodal post‑training workflows and data pipelines, particularly for agentic systems and tool use.
- Hands‑on experience applying post‑training at scale using distributed training frameworks (e.g., multi‑node GPU environments).
- Demonstrated experience improving model capabilities in areas such as reasoning, tool use, and multi‑agent coordination, with SOTA results.
- Proven track record of open‑source contributions related to agentic systems or tool use (code, datasets, or models) on platforms such as GitHub or Hugging Face.
- Publications at leading AI conferences (e.g., NeurIPS, ICML, ICLR, ACL, CVPR, ECCV).