Site Reliability Engineer
Listed on 2026-06-09
IT/Tech
Systems Engineer, SRE/Site Reliability, Cloud Computing, IT Support
Full-Time | Tempe, AZ
About Basata
At Basata, we’re rethinking the way healthcare practices get work done. Our AI-powered tools help automate the repetitive, time‑consuming tasks that bog teams down, so staff can spend less time on admin and more time with patients.
We’re a small, fast‑moving team that loves building smart systems and simple, intuitive experiences. If you’re someone who gets excited about early‑stage energy, designing better ways to work, and fixing what’s broken in healthcare, we’d love to connect.
About the Role
We're looking for our first dedicated Site Reliability Engineer (SRE) to own reliability as we grow. This is a build role, not a maintenance one. Our infrastructure today is deliberate and well‑structured: infrastructure‑as‑code, containerized services, clear conventions, and a defined deployment approach. We've built a solid foundation, and the next phase is designing the reliability practice, tooling, and architecture that will carry us from serving our current clinics to serving many times more.
You'll define how we do SRE here, set the standards, and have real ownership over a domain that directly determines whether clinics can trust us with their work.
Own the reliability, availability, and performance of our production platform—define our SLOs, build the observability to measure against them, and drive the work to meet them.
Establish our incident response practice end to end: triage, mitigation, resolution, and blameless postmortems that actually prevent recurrence.
Design and build the next generation of our infrastructure and deployment systems as we scale—evolving our infrastructure‑as‑code, deployment pipeline, and operational tooling.
Reduce operational toil through automation, so reliability scales faster than headcount.
Work closely with our engineers to make services more operable—better instrumentation, graceful degradation, and designs that hold up under failure. This means reading and contributing to application code, not just managing it from the outside.
Set the operational culture and engineering standards for reliability on a small, serious team—and grow the practice as the team grows.
Strong software engineering fundamentals: you write code to solve operational problems, not just configure systems. Our stack spans Java and Python on the backend with TypeScript on the frontend, and you'll work across it.
Real experience running production systems: containerized services, cloud infrastructure, and infrastructure‑as‑code.
Depth in observability and incident response—you've built monitoring and alerting that catches problems early, and you've led real incidents to resolution under pressure.
The ability to pick up an unfamiliar codebase, reason about how it behaves in production, and identify failure modes—because you'll need to understand our system to keep it reliable.
Experience designing for reliability at the architecture level: capacity planning, scaling strategies, failure isolation, and safe deployment practices.
Calm, structured judgment during incidents, and the discipline to turn each one into a lasting improvement.
Comfort owning an ambiguous, greenfield mandate—you're energized by defining a practice from the ground up rather than inheriting a finished one.
Experience as an early or first SRE hire, or building a reliability function from scratch.
Experience scaling systems through stages of rapid growth.
Healthcare, regulated‑industry, or other high‑stakes‑reliability background—where downtime and data handling carry real consequences.
Drive real impact. Your work will help clinics run more smoothly and ensure patients get the care they need—faster and with fewer headaches.
Shape something meaningful. From early product decisions to UI details, you'll play a big role in crafting both the code and the overall user experience.
High ownership. Join a team where you’re trusted to lead, build, think critically, and bring ideas to life.
Work with purpose. We’re not here to throw more tech at the wall; we’re solving real problems in healthcare with tools that people rely on…