
Platform Reliability Engineer

Job in Boise City, Cimarron County, Oklahoma, 73933, USA
Listing for: KeY2Moon Solutions
Full Time position
Listed on 2026-06-11
Job specializations:
  • IT/Tech
    Systems Engineer, IT Support, SRE/Site Reliability
Salary/Wage Range or Industry Benchmark: 100000 - 125000 USD Yearly
Job Description & How to Apply Below
Location: Boise City

Build client-critical software at KeY2Moon

At KeY2Moon Solutions, you will work on real client problems that affect revenue, operations, and customer experience. We combine agency speed with engineering discipline, so people who join us get broad ownership and measurable impact.

Direct exposure to product, architecture, and client decision-making

A digital subscription business is scaling quickly, but release windows trigger recurring incidents and rollback-heavy weekends for the internal team.

Their current pipeline was assembled in phases and lacks guardrails. We need a pragmatic engineer who can improve reliability without freezing product delivery.

You will redesign delivery controls, observability, and incident workflows so teams can ship often without breaking production.

Engagement Stack

Terraform

GitHub Actions

Responsibilities
  • Rework release flow using GitHub Actions, Terraform, and Kubernetes rollout controls that match real failure patterns
  • Improve incident readiness through better service ownership, Datadog/Sentry observability, and runbook quality
  • Set practical reliability KPIs from AWS infrastructure, deployment, and error telemetry that engineering and product can track together
  • Coach client squads on operational discipline, on-call readiness, and post-incident follow-through
Requirements
  • You have improved unstable pipelines in high-pressure environments using AWS, Kubernetes, and Infrastructure as Code
  • You can define reliability controls that teams adopt because they are practical for daily delivery, not just policy‑compliant
  • You are strong at production troubleshooting across infra, application, and CI/CD layers with clear incident communication
  • You can convert repetitive outage patterns into preventive engineering backlog with measurable reliability outcomes
Nice to have
  • Experience in subscription or payment‑heavy systems where uptime directly affects revenue
  • Experience running blameless postmortems with cross‑functional technical and business teams
  • Experience mentoring product engineers in reliability fundamentals and release safety practices
Hiring process
  • Intro call with talent team (30 minutes)
  • Practical role interview focused on recent project work (60-90 minutes)
  • Final panel on collaboration, ownership, and client communication