ML Ops Engineer — Agentic AI Lab; Founding Team
Location: San Francisco, San Francisco County, California, 94199, USA
Company: Fabrion
Position: Full Time
Listed on 2026-02-25
Job specializations:
• IT/Tech: AI Engineer, Machine Learning/ML Engineer
Job Description & How to Apply Below
About the Role
ML Ops Engineer — Agentic AI Lab (Founding Team)
Location: San Francisco Bay Area
Type: Full-Time
Compensation: Competitive salary + meaningful equity (founding tier)
Backed by 8VC, we're building a world-class team to tackle one of the industry’s most critical infrastructure problems.
Our AI Lab is pioneering the future of intelligent infrastructure through open-source LLMs, agent-native pipelines, retrieval-augmented generation (RAG), and knowledge-graph-grounded models.
We’re hiring an ML Ops Engineer to be the glue between ML research and production systems — responsible for automating the model training, deployment, versioning, and observability pipelines that power our agents and AI data fabric.
You’ll work across compute orchestration, GPU infrastructure, fine-tuned model lifecycle management, model governance, and security.
Responsibilities
• Build and maintain secure, scalable, and automated pipelines for:
  - LLM fine-tuning (SFT, LoRA, RLHF, DPO)
  - RAG embedding pipelines with dynamic updates
  - Model conversion, quantization, and inference rollout
• Manage hybrid compute infrastructure (cloud, on-prem, GPU clusters) for training and inference workloads using Kubernetes, Ray, and Terraform
• Containerize models and agents using Docker, with reproducible builds and CI/CD via GitHub Actions or ArgoCD
• Implement and enforce model governance: versioning, metadata, lineage, reproducibility, and evaluation capture
• Create and manage evaluation and benchmarking frameworks (e.g., OpenLLM-Evals, RAGAS, LangSmith)
• Integrate with security and access control layers (OPA, ABAC, Keycloak) to enforce model policies per tenant
• Instrument observability for model latency, token usage, performance metrics, error tracing, and drift detection
• Support deployment of agentic apps with LangGraph, LangChain, and custom inference backends (e.g., vLLM, TGI, Triton)
Desired Experience
Model Infrastructure:
• 4+ years in MLOps, ML platform engineering, or infra-focused ML roles
• Deep familiarity with model lifecycle management tools: MLflow, Weights & Biases, DVC, Hugging Face Hub
• Experience with large model deployments (open-source LLMs preferred): LLaMA, Mistral, Falcon, Mixtral
• Comfortable with tuning libraries (Hugging Face Trainer, DeepSpeed, FSDP, QLoRA)
• Familiarity with inference serving: vLLM, TGI, Ray Serve, Triton Inference Server
Automation + Infra
• Proficient with Terraform, Helm, K8s, and container orchestration
• Experience with CI/CD for ML (e.g., GitHub Actions + model checkpoints)
• Experience managing hybrid workloads across GPU clouds (Lambda, Modal, Hugging Face Inference, SageMaker)
• Familiar with cost optimization (spot instance scaling, batch prioritization, model sharding)
Agent + Data Pipeline Support
• Familiarity with LangChain, LangGraph, LlamaIndex, or similar RAG/agent orchestration tools
• Built embedding pipelines for multi-source documents (PDF, JSON, CSV, HTML)
• Integrated with vector databases (Weaviate, Qdrant, FAISS, Chroma)
Security & Governance
• Implemented model-level RBAC, usage tracking, audit trails
• Integrated with API rate limits, tenant billing, and SLA observability
• Experience with policy-as-code systems (OPA, Rego) and access layers
Preferred Stack
• LLM Ops: Hugging Face, DeepSpeed, MLflow, Weights & Biases, DVC
• Infra: Kubernetes (GKE/EKS), Ray, Terraform, Helm, GitHub Actions, ArgoCD
• Serving: vLLM, TGI, Triton, Ray Serve
• Pipelines: Prefect, Airflow, Dagster
• Monitoring: Prometheus, Grafana, OpenTelemetry, LangSmith
• Security: OPA (Rego), Keycloak, Vault
• Languages: Python (primary), Bash; optionally Rust or Go for tooling
Mindset & Culture Fit
• Builder's mindset with startup autonomy: you automate what slows you down
• Obsessive about reproducibility, observability, and traceability
• Comfortable working on a hybrid team of AI researchers, DevOps, and backend engineers
• Interested in aligning ML systems to product delivery, not just papers
• Bonus: experience with SOC 2, HIPAA, or GovCloud-grade model operations
What We’re Looking For
Experience:
• 5+ years as a full-stack or backend engineer
• Experience owning and delivering production systems end-to-end
• Prior…