Computing Control Room Operator
Listed on 2025-12-31
IT/Tech
IT Support
We are seeking a mid-level Computing Control Room Operator who will focus on responding to upset conditions in the data centers. This position resides in the HPC Infrastructure Operations Group in the National Center for Computational Sciences (NCCS) Division of the Computing and Computational Sciences Directorate (CCSD) at Oak Ridge National Laboratory (ORNL).
Computing Control Room Operators will perform duties similar to those of a senior operator but will not have a senior operator's in-depth knowledge and experience. They will be mentored and trained by senior operators. After initial training they will work independently, but they will be given contact information for senior operators and center support staff to help with problem solving and operational questions.
Operators fulfill a valuable role for operations in multiple data centers. The following are highlights of the work that the operators perform. The list is not all-inclusive but demonstrates ORNL’s dependency upon the operators for center performance and reduced operational risks.
Major Duties/Responsibilities:
- The operators provide 24-hour-per-day response to upset conditions in the data centers. Operators are trained to the Computational Science Building Computer Center Operations Emergency Response Plan and Emergency Response Checklists. Upset events include water leaks, fire, power quality events, etc. Control Room Operators are trained to recognize an event, take the correct action to minimize risk or remediate the hazard, safely shut down equipment, and know whom to call and when. Senior operators and center support staff help them efficiently return the center to normal operations after the event. Not all events are emergencies (example: loss of electrical power), but the staff still implements the actions described above.
- Monitors the NCCS system that tracks the current operational status of NCCS systems. In the event of a problem/failure, Computing Control Room Operators notify the system admin, repair under the system admin’s direction, or turn the issue over to the system admin.
- Make hourly rounds in the data center spaces and transformer rooms. Many of these rounds have led to the identification of anomalies. Operators are expected to correct the anomaly or report it via email or log for further investigation.
- Monitors daily access into the centers and monitors work performance when performing rounds. Notifies the center manager of individuals requesting access and of any poor work practices observed.
- Works with the senior operators, center manager, and system admins to learn the Data Center Infrastructure Management (DCIM) system. Under the guidance of a senior operator, performs an annual audit of assigned racks to ensure that the in-rack systems match the DCIM information.
- Monitors other ORNL-specific monitoring systems for system operational status. Reports identified issues to senior operators or to system admins, and attempts repair under their guidance.
- Perform duties as ORNL after-hours support for the ORNL Solution center. Call the identified on-call support staff for assistance with resolving reported problems.
- Manages incoming freight and moves freight/hardware to the appropriate location. Notifies the freight owner of the arrival of their freight. This is vital to proper property management and control.
- Manages the excess or EPROP of unwanted hardware. This includes processing the excess request, tagging the equipment for pick-up, and maintaining additional records, which have been used to identify excessed equipment that the internal systems/processes did not handle correctly. Operators also coordinate excess property shipments from the facility.
- Prints and provides all labeling for connections and equipment.
- Maintains control of all hardware in accordance with SBMS requirements for media. Inventories and stores media and requests pick-up, following a cradle-to-grave approach to media control.
- Provide NCCS/OLCF user assistance with problems such as logging in, other types of…