Site Reliability Engineer

Job in Pittsburgh, Allegheny County, Pennsylvania, 15289, USA
Listing for: Decisive Point
Full Time position
Listed on 2026-07-03
Job specializations:
  • IT/Tech
    SRE/Site Reliability, Systems Engineer, Cloud Computing: Infrastructure & Operations, Unix/Linux
Salary/Wage Range or Industry Benchmark: 118,000 - 150,000 USD yearly

Asylon automates security operations using robots, software, and AI to help its customers protect their people, property, and assets. Founded in 2015 by three MIT aerospace engineers, Asylon brings a strong aerospace background to develop high‑quality, reliable robotics solutions for security applications.

About Us

We build field‑deployable infrastructure for automated robots to augment security teams. Our full‑stack aerial and ground robotic solution, Drone Core, provides capabilities that were previously unavailable to security organizations.

Position

Asylon is hiring a Site Reliability Engineer to join our Philadelphia team. You will be responsible for the reliability, availability, and performance of systems across cloud infrastructure, on‑prem servers in air‑gapped customer environments, and Kubernetes clusters on edge devices deployed with our robots. You will define and maintain SLOs, build observability into every layer of the stack, lead incident response, and drive automation that keeps our systems running without manual intervention.

This role sits at the intersection of infrastructure engineering and operations: you should be as comfortable writing code to eliminate toil as you are triaging an outage on a remote edge device.

Due to the nature of the projects, applicants must be a U.S. Person as defined by 22 C.F.R. §120.62 (U.S. Citizens, lawful permanent residents, refugees, or asylees).

Primary Duties
  • Own the reliability of production systems across cloud, on‑prem, and edge environments—define SLOs, track error budgets, and drive improvements.
  • Build and maintain observability infrastructure—monitoring, alerting, logging, and dashboards—to provide visibility into system health at every layer.
  • Lead incident response, conduct blameless post‑mortems, and implement remediation to prevent recurrence.
  • Develop automation to reduce toil, improve deployment reliability, and enable self‑healing infrastructure.
  • Build and maintain CI/CD pipelines for service deployment, testing, and infrastructure provisioning.
  • Manage Kubernetes clusters (K3s on edge, on‑prem, and managed cloud clusters)—deployments, upgrades, and troubleshooting.
  • Manage infrastructure‑as‑code for reproducible provisioning across cloud and air‑gapped on‑prem environments.
  • Collaborate with software and robotics engineers to build reliability into systems from the design phase.
Required Skills and Experience
  • 3+ years of professional experience in SRE, DevOps, or infrastructure engineering.
  • Strong in a high‑level language such as Python, Go, or Bash for building automation and tooling.
  • Proficient with Kubernetes—deploying, operating, debugging, and scaling containerized workloads.
  • Experience building and operating observability stacks—Prometheus, Grafana, Loki, or similar tools.
  • Background in CI/CD pipelines for automated testing, building, and deploying services.
  • Proficient with Linux systems administration and troubleshooting.
  • Experience with infrastructure‑as‑code tools such as OpenTofu, Terraform, or Ansible.
  • Comfortable with networking fundamentals—DNS, firewalls, VPNs, and debugging connectivity issues across distributed environments.
Bonus Points
  • Experience with K3s or lightweight Kubernetes on edge—running services on resource‑constrained hardware in the field.
  • Experience working in air‑gapped or disconnected environments where systems must operate without cloud dependencies.
  • Experience with on‑call rotations and structured incident management processes.
  • Familiarity with message brokers and streaming (MQTT, NATS, Kafka, or similar) for real‑time data pipelines.
  • Experience with robotics or IoT systems, particularly managing fleets of remote devices.
  • Experience with video streaming or processing pipelines in a production environment.
  • Comfortable getting hands‑on with hardware—building robots, tinkering with a Raspberry Pi, or debugging a device on a bench.
  • Experience with Bazel or similar build systems for managing complex, multi‑language codebases.
  • Experience with capacity planning and performance engineering.
Benefits
  • Competitive salary and equity packages.
  • 401(k) and 401(k) matching.
  • Medical, dental, and vision insurance.
  • Life…