Machine Learning Infra Engineer Job, San Francisco area, California, USA, Software Development

Reducto helps AI teams ingest real‑world enterprise data with state‑of‑the‑art accuracy. The vast majority of enterprise data, from financial statements to health records, is locked in unstructured file formats like PDFs and spreadsheets. We train vision models to read those documents the way a human would, and make it possible to build products, train models, and automate processes at scale.

We’ve raised over $100M from world‑class investors like A16z, Benchmark, and First Round Capital, and are hiring a Machine Learning Engineer to help us train and deploy the models critical to the performance of our core product.

The Opportunity

As an ML Infra Engineer, you’ll play a key role in building the inference and training frameworks that make it possible to deliver results. You’ll collaborate closely with our ML and Platform teams to scale training across nodes, develop faster and more efficient serving, and create observability across the stack. This is a high‑impact role where you’ll help define what high‑performance ML training and inference look like at Reducto.

What You’ll Do

Build and maintain our training and inference stack with an emphasis on fast iteration, flexibility for exploring new methods, and high‑performance inference.

Develop benchmarks for both stacks to identify bottlenecks.

Explore state‑of‑the‑art advances in training and inference and work to apply them.

Design systems for scaling model training across multi‑node, multi‑GPU environments with strong reliability and observability.

Scale distributed training and inference workloads across large GPU clusters while improving utilization, reliability, and cost efficiency.

Build the tooling, abstractions, and observability that help ML engineers move faster from experiment to production.

You’ll Thrive Here If You:

Hold yourself to a high bar for quality and precision.

Enjoy solving complex problems and building from first principles.

Have strong Python skills and a background in systems engineering.

Are comfortable with Kubernetes and distributed training frameworks.

Love getting your hands dirty with real‑world implementation challenges.

Operate well in fast‑changing, high‑growth environments.

Collaborate effectively across technical and non‑technical teams.

Take full ownership from strategy through execution.

Have experience at an early‑stage or high‑growth startup.

Have developed in open‑source training/inference stacks in a meaningful way.

Are excited to set up distributed inference across hundreds or thousands of GPUs.

Care deeply about combining technical excellence with business impact.

This is an in‑person role at our office in San Francisco. We’re an early‑stage company, which means that the role requires working hard and moving quickly. Please only apply if that excites you.

Benefits

Lunch:
Receive a free lunch to eat with your teammates daily at the office.

Transportation:
Reimbursed transportation costs.

Insurance:
Generous health insurance covering medical, dental, and vision.

Health and wellness budget:
Up to $150/mo reimbursement for health and wellness spending.

Parental leave:
Customisable leave schedule.

Reducto is an Equal Opportunity Employer committed to diversity and inclusion in the workplace. All qualified applicants will receive consideration for employment without regard to sex, race, color, age, national origin, religion, physical and mental disability, genetic information, marital status, sexual orientation, gender identity/assignment, citizenship, pregnancy or maternity, protected veteran status, or any other status prohibited by applicable law.
