
Senior Machine Learning Infrastructure Engineer

Job in San Francisco, San Francisco County, California, 94199, USA
Listing for: DeepRec.ai
Full Time position
Listed on 2026-01-09
Job specializations:
  • Software Development
    AI Engineer, Data Engineer
Salary/Wage Range or Industry Benchmark: 200,000 - 250,000 USD yearly
Job Description & How to Apply Below

This range is provided by DeepRec.ai. Your actual pay will be based on your skills and experience – talk with your recruiter to learn more.

Senior Machine Learning Infra Engineer | San Francisco | Competitive Salary + Equity

Our client is an early‑stage AI company building foundation models for physics to enable end‑to‑end industrial automation, from simulation and design through optimization, validation, and production. They are assembling a small, elite, founder‑led team focused on shipping real systems into production, backed by world‑class investors and technical advisors.

They are hiring a Machine Learning Cloud Infrastructure Engineer to own the full ML infrastructure stack behind physics‑based foundation models. Working directly with the CEO and founding team, you will build, scale, and operate production‑grade ML systems used by real customers.

What you will do
  • Own distributed training and fine‑tuning infrastructure across multi‑GPU and multi‑node clusters
  • Design and operate low‑latency, highly reliable inference and model serving systems
  • Build secure fine‑tuning pipelines allowing customers to adapt models to their data and workflows
  • Deliver deployments across cloud and on‑prem environments, including enterprise and air‑gapped setups
  • Design data pipelines for large‑scale simulation and CFD datasets
  • Implement observability, monitoring, and debugging across training, serving, and data pipelines
  • Work directly with customers on deployment, integration, and scaling challenges
  • Move quickly from prototype to production infrastructure
What our client is looking for
  • 3+ years building and scaling ML infrastructure for training, fine‑tuning, serving, or deployment
  • Strong experience with AWS, GCP, or Azure
  • Hands‑on expertise with Kubernetes, Docker, and infrastructure‑as‑code
  • Experience with distributed training frameworks such as PyTorch Distributed, DeepSpeed, or Ray
  • Proven experience building production‑grade inference systems
  • Strong Python skills and deep understanding of the end‑to‑end ML lifecycle
  • High execution velocity, strong debugging instincts, and comfort operating in ambiguity
Nice to have
  • Background in physics, simulation, or computer‑aided engineering software
  • Experience deploying ML systems into enterprise or regulated environments
  • Large‑scale ML data engineering and validation pipelines
  • Experience at high‑growth AI startups or leading AI research labs
  • Customer‑facing or forward‑deployed engineering experience
  • Open‑source contributions to ML infrastructure

This role suits someone who earns respect through hands‑on technical contribution, thrives in intense, execution‑driven environments, values deep focused work, and takes full ownership of outcomes. The company offers ownership of core infrastructure, direct collaboration with the CEO and founding team, work on high‑impact AI and physics problems, competitive compensation with meaningful equity, an in‑person‑first culture five days a week, strong benefits, daily meals, stipends, and immigration support.

Seniority level: Mid‑Senior level
Employment type: Full‑time
Job function: Information Technology
Industries: Research Services

Position Requirements
10+ years of work experience