Senior Site Reliability Engineer
Listed on 2026-05-27
IT/Tech
Cloud Computing, Systems Engineer, SRE/Site Reliability
About the Role
At Jamf, we believe in an open, flexible culture based on respect and trust. Our track record and thriving work environment all stem from the freedom we grant ourselves to get the job done right. We take pride in helping tens of thousands of customers around the globe succeed with Apple.
Locations
This role is offered as remote in the Minneapolis, MN; Eau Claire, WI; or Austin, TX metro areas. You may be required to work periodically at a Jamf office or collaborative work location with other Jamf employees in your area for certain events or moments that matter. We are only able to accept applications from candidates based in one of these locations.
As a Senior Site Reliability Engineer, you'll help us balance development velocity with the reliability our customers depend on. You'll partner with engineering teams to shape how their services are measured, lead the work to improve them, and use what you learn from production to build the automation and agentic tooling that improves reliability globally. This is a senior individual contributor role at the intersection of Engineering, Product, Customer Success and Technical Support, where you'll play a meaningful part in shaping how we practice SRE at Jamf.
Job Responsibilities
- Partner with engineering teams to define service‑level objectives, error budgets, and supporting indicators for their services, and help them use those measures to inform prioritization and reliability investment.
- Investigate complex production issues end‑to‑end across application, data, infrastructure, and network layers, using AI to correlate logs, metrics, and code and to pressure‑test hypotheses before acting.
- Produce clear technical documentation, runbooks, architecture notes, post‑mortems and proofs of concept for both technical and non‑technical audiences, in a form that engineers and AI tools can re‑use.
- Identify systemic sources of toil and lead the work to eliminate them through automation, AI agents, tooling, and process change.
- Set the conditions for AI agents to do reliable work in our environment, including repository context, well‑specified tasks, integrations such as MCP servers that give AI safe access to the systems it needs, and the tests and guardrails needed for AI‑authored change to be trusted.
- Participate in team ceremonies to identify and refine work, communicate findings, and drive opportunities to collaborate.
- Drive cross‑team and cross‑department collaboration on reliability initiatives: reviewing designs, influencing roadmaps, and mentoring engineers on SRE practices and on effective AI use in their reliability work.
- Advise senior leadership and stakeholders during critical customer escalations, translating between technical reality and business impact.
- Contribute to scaling the SRE practice itself: improving our standards, our tooling, and how we partner with product engineering teams.
Required - Minimum of 5 years of experience in software engineering, SRE, or production operations roles.
Required - Strong production troubleshooting skills across the stack. Ability to diagnose issues from first principles using the tools available (profilers, heap and thread dumps, query plans, traces, logs, metrics).
Required - Experience working within some form of Agile development process.
Required - Hands‑on experience operating production services on AWS (e.g., EC2, S3, EKS, RDS/Aurora, CloudFront).
Required - Experience using observability tools (e.g., Grafana, Prometheus, LogicMonitor).
Required - Experience creating clear and concise technical documentation that is targeted at both technical and non‑technical audiences.
Required - Experience writing infrastructure as code.
Required - Experience writing automation in a general‑purpose language (e.g., Python, Go, Java, or similar) to a production standard.
Required - Strong judgement about how to apply AI effectively across the full range of SRE work, including high‑stakes areas such as production access and sensitive data, knowing how to scope and verify work to make it safe.
Required - Hands‑on experience using agentic development tools (e.g., Claude Code, Cursor, Copilot) to…