Founding Engineer – Full Stack ML DevTools & Systems

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: David Joseph & Company
Full Time position
Listed on 2026-05-31
Job specializations:
  • Software Development
    Cloud Engineer - Software, DevOps, AI Engineer
Salary/Wage Range or Industry Benchmark: USD 150,000 - 250,000 per year
Job Description & How to Apply Below

Founding Engineer – Full Stack ML DevTools & Systems

Location: San Francisco, CA
Type: Full‑Time
Base Compensation: $150,000 – $250,000
Equity: Competitive Series A Equity Package

Overview

This is a founding‑level engineering role within a Series A AI infrastructure company building core developer tools and platform primitives for post‑training, evaluation, and reinforcement learning workflows.

The platform enables ML engineers and researchers to:

Create structured training data

Evaluate model performance reliably and reproducibly at scale

This is a high‑ownership role at the center of the product. You will operate across the Python SDK, backend systems, infrastructure, and developer experience—partnering directly with frontier labs, enterprise AI teams, and AI‑native startups.

This is not a narrow feature role. You will shape foundational platform architecture and developer workflows that power advanced model training systems.

Core Responsibilities

Design and implement backend systems supporting post‑training workflows, dataset primitives, run tracking, and artifact management

Build reliable execution and orchestration systems with strong isolation and reproducibility

Improve observability, debugging capabilities, and performance across job execution and distributed data pipelines

Contribute to containerized infrastructure and Kubernetes‑based deployment patterns

Own and evolve the Python SDK with clean APIs, strong documentation, intuitive defaults, and extensibility

Design developer‑friendly abstractions for reinforcement learning, evaluation loops, and training workflows

Develop evaluation‑native workflows connecting capability measurement, data creation, training, and re‑evaluation loops

Improve CLI tools, developer interfaces, and local‑to‑cloud workflows

Work across compute, networking, storage, and IAM configurations

Design systems that are scalable, reproducible, and secure

Collaborate on distributed systems design and execution infrastructure

Partner directly with ML engineers and researchers to translate real‑world workflows into platform improvements

Incorporate structured customer feedback into roadmap decisions

Operate at the intersection of research needs and production reliability

Requirements

Strong production experience in Python

Comfort operating across the stack, including APIs, backend systems, data systems, and frontend integration

Deep understanding of Docker and Linux environments

Strong product instincts with a bias toward shipping

Demonstrated end‑to‑end ownership of production systems
