Cloud Infrastructure Engineer
Listed on 2026-06-07
IT/Tech
Systems Engineer, Cloud Computing
Staff Cloud Infrastructure Engineer at Restate
Restate (restate.dev) is a lightweight runtime that turns AI agents, workflows, and backend services into durable processes - so teams can focus on their logic, not failure mechanics.
The role:
We're looking for a Senior-to-Staff-level cloud infrastructure engineer to work across all product pillars: OSS, on-prem deployments, multi-tenant SaaS, and BYOC (bring your own cloud). This means deep work in our Rust-based infrastructure layer, integrating with cloud provider APIs, building infrastructure-as-code tooling, and ensuring reliability and security. You'll have significant ownership over major parts of our cloud infrastructure.
Front-row seat to the biggest infra shift in decades
Durable runtimes like Restate are becoming the next foundational infrastructure component - and increasingly a critical piece for AI applications. As systems become more agentic, long-running, integration-heavy, and failure-prone, durable execution turns reliability from a bespoke engineering tax into a default property. In this role, you're not watching that shift from the sidelines - you help build the platform that enables it.
State-of-the-art tech, built from first principles
Restate re-imagines durable execution as a lightweight self-contained stack - no database required - and ships as a single Rust binary with an optimized custom storage layer, low latency orchestration, and an analytics engine for observability.
Enterprise Traction:
Restate is already used by Fortune 500 companies, including Tier 1 banks running critical financial workflows, and by cutting-edge AI and infra startups pushing the boundary of what "production-grade agents" means. You'll work on problems where reliability, correctness, and operational simplicity are existential.
Work with world-class engineers:
You'll partner directly with engineers who've built and operated foundational systems at scale - creators of Apache Flink, and leaders from Meta's messaging infrastructure. You'll have the chance to work with incredibly talented individuals who care deeply about their craft.
This is a Cloud Infrastructure Engineering role spanning Restate's product offering: OSS, on-prem deployments, Multi-tenant SaaS, BYOC. The scope of the role includes but is not limited to:
- Build and operate Restate Cloud: extend our managed multi-tenant offering, working across the infrastructure, control plane, networking, storage, and observability of Restate workloads.
- Evolve our BYOC product and work with customers on operating on-prem installations: design and build the infrastructure that runs inside customer cloud accounts.
- Reliability and observability across the fleet: SLOs, metrics, traces, logs, alerting, and runbooks. Build automation so we can scale our product offering across deployment methods.
- On-call: participate in the cloud on-call rotation. A US-based hire materially improves our timezone coverage.
We're targeting Senior-to-Staff: you've operated production SaaS or platform infrastructure before, you've seen real failure modes, and you have (strong) opinions about how to run multi-tenant systems. You have an appreciation for operating in a compliance-sensitive environment.
Must-Haves:
- Strong cloud infrastructure background with deep understanding of major cloud provider architectures.
- Experience with infrastructure-as-code and cloud orchestration, particularly Kubernetes-based stateful workloads; balancing continuous delivery with safety while maintaining large-scale production systems.
- Software engineering skills in a systems language (Rust, Go, C++); willingness and ability to learn Rust on the job.
- You should be comfortable taking ownership end-to-end, from design through production operations, and thrive in early-stage startup ambiguity.
- Prior experience with Restate or durable execution specifically.
- Deep enterprise procurement/compliance navigation.
- Kubernetes operator development, experience with IaC systems like Cluster API, Crossplane or Terraform.
This role is likely not a fit if:
- You want to work primarily on the runtime core rather than cloud, BYOC, and customer-facing infra.
- You've mostly architected and reviewed, and aren't excited to be hands-on.
- You are averse to multi-cloud, Kubernetes, or operating infrastructure as a shared responsibility with customers.
Tech stack:
- We use Restate extensively: the Restate Cloud control plane is built on Restate and TypeScript.
- Rust infrastructure services and Kubernetes operators.
Location and travel:
- US-based, fully remote. East Coast is a plus as it would materially improve our on-call coverage given the team's existing geography.
- Travel: minimal - occasional team offsites, little required customer travel.