
Site Reliability Engineer II

Job in Orlando, Orange County, Florida, 32885, USA
Listing for: Kastle Systems
Full Time position
Listed on 2026-06-06
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing, IT Support
Salary/Wage Range or Industry Benchmark: USD 100,000 - 125,000 per year
Job Description & How to Apply Below

Overview

Join the leader in providing smarter solutions for a safer world. The property technology space is growing rapidly, and Kastle Systems is leading the way. Kastle Systems is the leader in managed security, with a track record of introducing innovative technologies to serve over 460 million square feet of real estate globally. Clients span the commercial and multifamily real estate, education, and construction industries and the customers they serve.

Delivering a world-class customer experience drives everything we do. Kastle’s mission is to be our customers’ best service provider and to ensure that their security is the most effective, efficient, and convenient. Kastle’s integrated security solution, including access control, video, and remote video monitoring, significantly reduces costs and improves the critically important 24x7 performance for building owners, developers, and tenants.

Site Reliability Engineer II

The SRE II sits at the intersection of software engineering and platform operations. You will own the reliability, scalability, and operational hygiene of Kastle’s core infrastructure – engineering away toil, hardening deployment pipelines, and partnering with product engineering teams to make new services production-ready from day one. This is a mid-level individual contributor role. You are expected to execute technical work independently, drive reliability improvements end-to-end, and participate meaningfully in architecture discussions.

You will carry on-call responsibilities as part of a shared rotation with a well-defined escalation model and a strong blameless post-incident review culture. The team is in the middle of a meaningful platform evolution: formalizing multi-tier release pipelines (Dev to QA to Integration to UAT to Prod) with ArgoCD-based approval gates, building out SLI/SLO frameworks, and migrating toward full GitOps.

You will be a hands-on contributor to all of it.

Key Responsibilities
Release Engineering & GitOps
  • Own and evolve the multi-stage deployment pipeline using ArgoCD, including approval gates, promotion policies, and rollback mechanisms.
  • Maintain trunk-based branching discipline and enforce release governance standards across the engineering organization.
  • Manage feature flag lifecycle – from creation and gradual rollout to deprecation – in coordination with product and QA teams.
  • Build and maintain CI/CD pipelines that enable safe, frequent, and auditable deployments.
Infrastructure as Code & Cloud Operations
  • Provision and manage Azure infrastructure using Terraform or OpenTofu, maintaining drift-free state aligned with GitOps principles.
  • Own Kubernetes cluster operations including workload scheduling, resource optimization, RBAC, network policy, and cost governance.
  • Identify and act on infrastructure cost optimization opportunities (compute rightsizing, storage tier selection, idle resource elimination).
  • Support Crossplane or similar operator patterns for Kubernetes-native infrastructure management where applicable.
Reliability & Observability
  • Define, instrument, and enforce SLIs and SLOs in partnership with product engineering teams.
  • Build and maintain observability infrastructure – metrics, logs, and distributed traces – using Prometheus, Grafana, OpenTelemetry, or equivalent tooling.
  • Conduct proactive capacity planning and performance tuning across multi-tenant, distributed environments.
  • Establish and maintain runbooks, dashboards, and alerting policies that reduce cognitive overhead during incidents.
Incident Management
  • Participate in a shared on-call rotation covering core platform and infrastructure services; on-call load is balanced across the team with structured handoff practices.
  • Lead mitigation of live production incidents with a focus on minimizing MTTR and clear stakeholder communication under pressure.
  • Facilitate blameless post-incident reviews and drive preventative engineering to closure – not just documentation.
Engineering Partnership
  • Embed with product engineering teams during design and architecture phases to establish reliability, scalability, and security requirements before code is written.
  • Maintain clear, comprehensive documentation for…