
Site Reliability Engineer - Operations

Job in Orem, Utah County, Utah, 84057, USA
Listing for: Utah Valley University
Full Time position
Listed on 2026-06-07
Job specializations:
  • IT/Tech
    IT Support, Systems Administrator
Job Description & How to Apply Below
Position: Site Reliability Engineer I - Operations
At Utah Valley University, this role offers the opportunity to play a critical part in supporting the infrastructure that powers teaching, learning, and daily operations across a dynamic campus environment. Working closely with senior administrators, you will manage and optimize enterprise systems and applications, ensuring reliability, security, and performance. From configuring servers and maintaining system health to building monitoring solutions and automating processes through CI/CD pipelines, this position allows you to apply and grow your technical expertise while making a meaningful impact on the university community.

In addition to hands-on systems and site reliability engineering work, you will collaborate across teams on complex initiatives, contribute to innovative solutions, and help drive operational excellence. With access to modern tools like Atlassian platforms and opportunities to enhance system resilience and efficiency, this role is ideal for someone who values continuous improvement, teamwork, and purpose-driven work. UVU provides a supportive environment where your contributions directly enhance user experiences and help ensure access to reliable technology for students, faculty, and staff.

* Under close supervision, plans epics and executes projects related to the three pillars of IT operations: operational processes; change, incident, and problem management; and Ops readiness. Assists in the execution of monitoring systems and alert configurations so that Operations knows about outages before users do.

* Collaborates with leadership on the creation, facilitation, and integration of documentation, including installation steps, standard operating procedures, incident runbooks, and disaster recovery documentation, into a curated change/incident/problem management library. Assists network, application, database, and systems administrators with the enforcement of standard procedures, acts as remote hands within a secure data center, and maintains all required supplies and tooling for the deployment of physical enterprise equipment.

* As an incident commander, participates in the business-hours on-call rotation, evaluating incoming alerts for validity and dispatching the appropriate SME to resolve issues. Executes public communications in accordance with operational standard procedures, informing stakeholders of possible service disruptions. Maintains the integrity of runbooks.

* Perform other job-related duties as assigned.

* An associate degree and a minimum of two years of relevant experience, or an equivalent combination of education and experience totaling four years.

* Current CompTIA A+, Network+, Security+, or Linux+ certification, or an equivalent industry-recognized IT credential, required.

Knowledge

* Knowledge of Linux and Windows operating systems, TCP/IP fundamentals, firewall management, and anti-virus software.

* Knowledge of best practices for securing operating systems, data center maintenance, and network setup.

* Knowledge of various monitoring solutions such as Prometheus, PRTG, Site24x7, TestCafe, Selenium, Splunk, New Relic, Azure Monitor, and AWS CloudWatch.

* Knowledge of storage technologies such as SAN or NAS.

* Knowledge of Azure Active Directory, Active Directory, and LDAP.

* Knowledge of load balancing, clustering, and enterprise server architecture.

* Knowledge of relational database principles and databases/languages such as PL/SQL, MySQL, SQL Server, Oracle, Microsoft SQL, or MS Access.

* Knowledge of the Atlassian suite, including Jira, Confluence, Statuspage, and Opsgenie.

* Knowledge of Scrum/Agile principles as applicable to a DevOps team.

Skills

* Communicate effectively, both verbally and in writing, in normal and high-pressure situations.

* Perform basic server, system, and application procedures such as managing user access, performing maintenance, and troubleshooting.

* Skills in troubleshooting hardware and software problems and researching technical issues.

* Experience using basic CLI tools in Windows and Linux operating systems to troubleshoot and gather information.

* Skills in customer service and interpersonal communication, both verbal and written.

* Basic scripting and programming skills in languages and tools such as Python, JavaScript, JSON, SQL, Bash, TestCafe, and Selenium.

* Experience with instant communication and team collaboration platforms like MS Teams, Slack, or Jitsi.

* Skills in working in an ITSM solution such as Jira, ServiceNow, or Asana.

Abilities

* Ability to identify, research, troubleshoot, and implement solutions for hardware and software problems.

* Ability to work in a customer service, team-oriented, collaborative, Scrum/Agile environment.

* Highly self-motivated with the ability to learn quickly and accept feedback from peers.

* Ability to learn the implementation process and maintenance procedures for new technologies, equipment, hardware, and software such as operating systems, ITSM tools, monitoring solutions, and data center management.

*…