Senior AI/ML Engineer – Post-Training Platform
Location: Pasadena area, California, USA
Category: IT/Tech

We build high-performance foundation models designed to run efficiently across a wide range of environments, from edge devices to large-scale deployments. Our work spans models from ~1B to 100B+ parameters across LLMs, diffusion models, and other modalities, with a strong focus on scalable training, efficient inference, and real-world deployment.

Our Bonsai family of 1-bit and ternary models is designed to dramatically improve the efficiency of modern AI systems, enabling advanced intelligence to run with significantly lower memory usage, latency, and energy consumption across cloud and edge environments.

Role Overview

We are seeking a Senior-level (or higher) AI/ML engineer with expertise in post-training systems to contribute to the development of our post-training platform for Bonsai models. This role focuses on building scalable systems for fine-tuning, reinforcement learning, evaluation, orchestration, and model lifecycle management across cloud infrastructure and partner-hosted environments.

Responsibilities

You will design, build, and optimize platform infrastructure supporting post-training workflows for highly efficient AI models. Core responsibilities include:

Building scalable systems for fine-tuning, reinforcement learning, evaluation, and post-training workflows for Bonsai models
Developing infrastructure for data ingestion, dataset preparation, orchestration, artifact storage, logging, telemetry, and cost tracking
Supporting post-training techniques including LoRA, full fine-tuning, PPO, GRPO, DPO, and related optimization workflows
Building reliable multi-tenant infrastructure with strong isolation, access control, observability, and production reliability
Developing systems for model evaluation, benchmarking, experiment tracking, and model lifecycle management
Collaborating with model, infrastructure, and product teams to improve training efficiency, usability, and deployment readiness
Translating advances in post-training workflows and AI infrastructure into robust, production-ready platforms

Basic Qualifications

You bring experience building AI/ML infrastructure and post-training systems:

5–8+ years of experience in machine learning systems, distributed systems, infrastructure engineering, or related fields
Strong programming skills in Python and experience building production-quality AI/ML systems
Hands-on experience building infrastructure for fine-tuning, reinforcement learning, or large-scale AI workflows
Solid understanding of distributed systems, orchestration, and modern AI/ML pipelines
Experience deploying AI/ML systems in cloud or production infrastructure environments
Familiarity with observability, monitoring, and debugging production systems
Proven ability to mentor and collaborate effectively with other engineers

Preferred Qualifications

You have additional experience aligned with scalable post-training platforms and efficient AI systems:

Experience building platforms for fine-tuning, reinforcement learning, evaluation, and model management for LLMs or multimodal models
Familiarity with post-training methods such as LoRA, RLHF, PPO, DPO, GRPO, or related optimization approaches
Experience working with quantized, compressed, or low-bit models (e.g., 1-bit or ternary representations)
Familiarity with orchestration systems, multi-tenant infrastructure, API gateways, and production platform operations
Experience building developer-facing platforms including SDKs, CLIs, APIs, or self-serve tooling
Experience supporting cloud-based or partner-hosted AI workflows and deployment pipelines
Contributions to open-source AI infrastructure, tooling, or model training frameworks

Ideal Candidate Profile

You have built or significantly contributed to AI infrastructure platforms that support large-scale post-training workflows. You understand the challenges involved in fine-tuning, reinforcement learning, evaluation, and model management for modern AI systems, and you know how to build reliable, scalable infrastructure around them. You care deeply about usability, efficiency, and developer experience, enjoy solving complex systems problems, and thrive at the intersection of AI models, infrastructure, and real-world deployment.
