Site Reliability Engineer
Listed on 2026-02-16
IT/Tech
SRE/Site Reliability, IT Support
Description
Site Reliability Engineer (SRE)
Reliability, observability, and calm leadership when it matters most.
Smiley Technologies powers core banking platforms used by banks and credit unions across the United States. When something goes wrong, it’s not just an alert - it’s real customers, real transactions, and real impact.
We’re hiring a Site Reliability Engineer to replace a former senior SRE who recently moved into a leadership role. This is a mid-level SRE position, ideal for someone who already has hands‑on SRE or Platform experience and is ready to grow into broader ownership over time. This role sits within our Platform / DevOps Engineering team and works across nearly every technical group in the organization.
You’ll be hands‑on with observability, incident response, CI/CD, and reliability practices - and you’ll help make life easier for engineers across Smiley.
This SRE is a key Incident Command contributor. Critical incidents are rare (only a few per year), but when they happen, you’ll:
- Run the call with clients
- Make sure the right people are present
- Keep communication clear, calm, and structured
- Capture notes and drive strong post‑incident learning
If you’re someone who stays composed under pressure and communicates clearly - this role will suit you well.
What You’ll Be Doing
- Work cross‑functionally with Network, SecOps, DevSecOps, Platform, Developers, and Support
- Own and evolve observability and monitoring, primarily using Dynatrace: dashboards, alerts, reporting, and adoption across teams
- Help teams improve root cause analysis, retrospectives, and reliability practices
- Support and improve CI/CD pipelines (GitHub Actions / Azure DevOps)
- Maintain standards and documentation across the SDLC
- Monitor and optimize cloud costs, infrastructure tiers, and capacity
- Participate in Incident Command on‑call rotation
- Define and document SLIs, SLOs, SLAs, KPIs, and OKRs
- Promote both Shift Left and Shift Right reliability thinking
- Help Smiley continue its DevOps and Platform transformation
- Hybrid infrastructure (on‑prem + cloud)
- Azure‑first (AWS experience welcomed)
- Containers: Docker, ACR, AKS
- Infrastructure as Code and automation‑driven workflows
- Regulated, high‑availability financial systems
We care more about real experience + communication + teachability than checking every box.
Core Experience
- 2+ years in an SRE, Platform Engineer, or DevOps role
- Hands‑on experience with APM / observability tools (Dynatrace strongly preferred; Datadog, New Relic, Prometheus/Grafana also relevant)
- Experience with Azure or AWS
- Experience supporting CI/CD pipelines
- Experience with containers (Docker, AKS)
- Working knowledge of Git, Terraform, Helm, Bash, PowerShell
- Experience supporting REST APIs
- Experience with .NET or Python (or similar)
- Linux/Unix administration fundamentals
- Performance troubleshooting across applications and databases (SQL Server, DB2, PostgreSQL)
- Familiarity with WAFs, networking, and OWASP concepts
- Experience with developer portals (Backstage.io a plus)
- Financial services or regulated environments (helpful, not required)
- Work on core banking systems where reliability truly matters
- Real ownership without being thrown in the deep end
- Strong culture of mentorship and growing people
- A team that values communication, learning, and integrity
- Opportunity to shape what SRE looks like at Smiley
If you’re an SRE who wants meaningful responsibility, strong mentorship, and the chance to grow - we’d love to talk. Apply today and our recruiter will reach out with more details.