
Machine Learning Engineer

Job in Mountain View, Santa Clara County, California, 94039, USA
Listing for: GMI Cloud
Full Time position
Listed on 2026-06-02
Job specializations:
  • Software Development
    AI Engineer, Machine Learning / ML Engineer
Salary/Wage Range or Industry Benchmark: 125,000 - 150,000 USD yearly

Location: Bay Area (frequent customer interaction)

Team: Inference & Reinforcement Learning Platform

About the Role

We’re looking for a Machine Learning Engineer (MLE) to work directly with customers and partners to design, deploy, and validate inference and reinforcement learning (RL) proof-of-concepts on GMI’s GPU infrastructure.

This is a high-impact, hybrid engineering role that sits at the intersection of platform engineering, applied ML, and customer success. You’ll be embedded with customers during early-stage deployments—turning research ideas, datasets, and business requirements into working, performant systems on real GPU clusters.

If you enjoy being close to users, debugging real systems, and shipping results fast (not just writing docs), this role is for you.

What You’ll Do
Own customer POCs end‑to‑end
  • Deploy and optimize LLM inference, RL training, and post‑training workflows on GMI clusters
  • Translate customer requirements into concrete system designs and experiments
Forward‑deploy with customers
  • Work hands‑on with research teams, startups, and enterprise customers
  • Debug performance, stability, and correctness issues in real environments
  • Stand up and tune inference stacks (e.g. vLLM / SGLang / Ray Serve‑style architectures)
  • Optimize latency, throughput, GPU utilization, and cost efficiency
RL & post‑training POCs
  • Support RLHF / RFT / SFT workflows using customer‑provided datasets
  • Integrate SDKs, training APIs, and cluster resources to shorten the idea‑to‑experiment cycle
Performance & reliability
  • Diagnose GPU, networking, and distributed system bottlenecks
  • Run benchmarks, profiling, and stress tests on multi‑GPU / multi‑node setups
Feedback loop to product
  • Feed real‑world customer learnings back into GMI’s platform, SDKs, and APIs
  • Help shape reference architectures, cookbooks, and best practices
What We’re Looking For
Core Requirements
  • Strong software engineering background (Python required; Go / Rust a plus)
  • Hands‑on experience with ML inference or training systems
  • Familiarity with distributed systems and GPUs (multi‑GPU, multi‑node)
Nice to Have
  • Experience with:
    • RL or post‑training workflows (RLHF, RFT, SFT)
    • PyTorch, DeepSpeed, Megatron‑LM, or similar
    • GPU performance profiling and optimization
  • Prior experience as:
    • Solutions Engineer
    • Applied Research Engineer
What Makes This Role Special
  • You’re close to real users and real GPUs, not abstract roadmaps
  • You’ll work on cutting‑edge inference and RL workloads, not toy demos
  • You’ll influence product direction through direct customer feedback
  • Fast iteration, high ownership, and visible impact
Who Thrives Here
  • Engineers who like shipping over theorizing
  • People who enjoy being the "last mile" problem solver
  • Builders who want exposure to both deep systems and applied ML
  • Those excited by early‑stage POCs that turn into real production systems