Site Reliability Engineer
Listed on 2026-07-01
IT/Tech
Systems Engineer, IT Support, Cloud Computing: Infrastructure & Operations
Location: Salt Lake, UT, USA
Full-time position
Detailed Job Description:
Well versed in application monitoring tools (Splunk, ExtraHop, AppDynamics, Prometheus, and Grafana)
Good understanding of, and hands-on experience with, JVM and database metrics (CPU, memory, and disk space utilization; threads; connection counts)
Good understanding of Java web services (REST Services)
Should be able to analyze traffic patterns, errors, and exceptions from logs and suggest improvements
Strong communication and interpersonal skills for working with senior management and stakeholders such as App Dev, Infra teams, and DBAs
Should be able to work independently to drive stability improvements and ensure zero downtime for our web services and database