Software Engineer, ML Infrastructure Job Boston area, Massachusetts USA, Software Development

Position: Staff Software Engineer, ML Infrastructure

We’re a high-tech home security company that’s passionate about protecting the life you’ve built and our mission of keeping Every Home Secure. And we’ve created a culture here that cares just as deeply about the career you’re building. Ours is a no-ego culture of collaboration and innovation where those seeking their next challenge can find big opportunities and make a huge impact on the lives of everyone we protect.

We don’t just want you to work here. We want you to grow and thrive here.

We’re embracing a hybrid work model that enables our teams to split their time between office and home. Hybrid for us means we expect our teams to come together in our state-of-the-art office on two core days, typically Tuesday, Wednesday, or Thursday – working together in person and choosing where they work for the remainder of the week. We all benefit from flexibility and get to use the best of both worlds to get our work done.

Why are we hiring?

Well, we’re growing and thriving. So, we need smart, talented, and humble people who share our values to join us as we disrupt the home security space and relentlessly pursue our mission of keeping Every Home Secure.

About the Role

We’re looking for a Staff Software Engineer to join our Cloud ML team, which owns both the cloud-side ML infrastructure and the applied ML research that powers SimpliSafe's intelligent home security products. This is a senior individual contributor role for a distributed systems expert who wants to apply that craft to one of the most demanding problem domains in the company.

You’ll partner closely with other Staff and Principal engineers to drive architecture, mentor across the team, and set the technical direction for our ML platform. The work spans two of our most demanding workloads: real-time computer vision inference that processes video from cameras and doorbells across our customer base, and LLM/GenAI infrastructure that will power our next generation of intelligent applications. Both are, fundamentally, distributed systems problems: high-throughput, low-latency, multi-tenant, GPU-aware, and unforgiving of regressions.

This role is for someone who has built and operated large-scale distributed services in production — high‑QPS APIs, real‑time platforms, low‑latency serving systems — and is excited to bring that depth to ML infrastructure. Prior ML experience is a plus, not a prerequisite. If you've shipped systems that serve a lot of traffic, scale gracefully, and stay up at 3am, we want to talk to you.

What You'll Do

Set technical direction for ML infrastructure

Drive architecture decisions for our Kubernetes-based ML platform — anchored on Ray for inference, alongside KServe, Triton, and vLLM — across real-time and batch workloads.
Lead deep technical reviews on system design, capacity planning, and reliability for the highest-stakes ML systems at SimpliSafe.
Identify and remove the systemic bottlenecks in our ML deployment infrastructure — whether that's serving reliability, deployment friction, observability gaps, scaling, or cost.

Build and operate real‑time CV inference at scale

Own the design and evolution of cloud-side inference systems that process live video and events from SimpliSafe devices in real time.
Drive throughput, latency, and cost improvements (batching strategies, GPU utilization, autoscaling, multi‑model serving) for production CV models.
Build the feedback loops between cloud inference, edge devices, and the data flywheel that improves model quality over time.

Stand up LLM/GenAI serving infrastructure

Help shape how SimpliSafe serves LLMs in production: model serving patterns, KV-cache and batching strategies, evaluation pipelines, guardrails, and cost controls.
Partner with applied ML engineers to take new GenAI‑powered product features from prototype to scaled deployment.

Raise the engineering bar across Cloud ML

Mentor engineers across the team through design reviews, code reviews, pairing, and written guidance, providing a meaningful uplift for everyone you work with.
Establish and evangelize best practices for model lifecycle management (registry, deployment,…