Infrastructure & SRE Engineer — Secure AI Platform, Equity
Listed on 2026-05-27
IT/Tech
SRE/Site Reliability, Systems Engineer, Cloud Computing, Network Engineer
Salary Range: $180K to $250K base plus equity
Company Overview
The company is small, technical, and operating at a high‑ownership stage. They are already seeing strong enterprise demand, including regulated and defense‑adjacent use cases, and are now hiring foundational infrastructure engineers who can help scale the platform.
This is a strong fit for engineers who want to work close to the metal on Kubernetes, containers, networking, cloud infrastructure, secure execution environments, observability, and distributed systems.
Role 1: Software Engineer, Infrastructure
The Infrastructure role is focused on building the core systems that power secure AI agent execution. This person will work on the platform layer that allows agents to run workloads safely, quickly, and reliably across cloud environments.
This role is a fit for someone who enjoys building foundational infrastructure, not just maintaining it. The ideal candidate has strong hands‑on experience with Kubernetes, Docker, Linux, networking, AWS or GCP, Terraform or Pulumi, and distributed systems.
What you will work on
- Build and scale secure infrastructure for AI agent workloads
- Design and operate sandboxed execution environments, containerized systems, and distributed job orchestration
- Improve performance across the platform, with a constant focus on speed, reliability, and efficiency
- Build secure VPC deployments for enterprise and regulated customers
- Work on infrastructure involving Kubernetes, Docker, Docker‑in‑Docker, microVMs, Terraform, Pulumi, AWS, GCP, Grafana, and Prometheus
- Debug complex production issues across containers, networking, Linux systems, cloud primitives, and distributed services
- Own systems from design through production deployment
Requirements
- Strong production experience with Kubernetes, Docker, cloud infrastructure, and distributed systems
- Deep knowledge in at least one infrastructure layer such as containers, networking, Linux, storage, or cloud primitives
- Experience building infrastructure systems from scratch
- Strong debugging ability below the surface of managed cloud tooling
- Background from a strong infrastructure‑heavy company or top engineering environment
- Comfortable working directly with founders in a small, fast‑moving startup
Role 2: Site Reliability Engineer
The SRE role is focused on keeping our client’s production infrastructure reliable, observable, secure, and scalable as customer demand grows. This person will own reliability practices, monitoring, alerting, incident response, deployment safety, and automation.
This role is a fit for someone who has operated production systems at scale and can improve reliability without adding unnecessary process. The ideal candidate has hands‑on experience with Kubernetes, Terraform or Pulumi, observability, incident response, SLOs, cloud infrastructure, and automation.
What you will work on
- Own production reliability across our client’s infrastructure platform
- Build and improve monitoring, alerting, dashboards, and observability workflows
- Lead incident response, root cause analysis, and postmortems
- Automate deployments, scaling, provisioning, and recovery tasks
- Improve developer experience through safer releases and better operational tooling
- Work with Grafana, Prometheus, Terraform, Pulumi, Docker, Kubernetes, Python or Go, AWS, GCP, Azure, and PagerDuty‑style workflows
- Help keep infrastructure highly available, secure, and ready for enterprise customers
Requirements
- 3+ years of explicit SRE, production infrastructure, or platform reliability experience
- Strong hands‑on experience with Kubernetes, Docker, Terraform or Pulumi, Grafana, and Prometheus
- Experience with incident response, on‑call, SLOs, SLIs, alerting, and production debugging
- Ability to automate reliability work with Python, Go, Bash, or infrastructure tooling
- Experience scaling infrastructure, not just maintaining it
- Background from a strong engineering company or infrastructure‑heavy environment
Our client is prioritizing candidates with strong recent full‑time experience at respected infrastructure or engineering companies. Target backgrounds include companies such as:
Google, Meta, AWS,…