NOC Engineer Job, Las Vegas area, Nevada, USA, IT/Tech

The NOC Engineer is a senior operational engineering role responsible for improving the availability, stability, and reliability of enterprise IT and OT systems across a multi-affiliate, regulated environment. This role leads complex incident response, resolves cross-domain production issues, and reduces repeat incidents through advanced troubleshooting, observability, automation, and disciplined operational execution.

The Network Operations Center plays a critical role in enterprise operations and supports the continued evolution of a broader command center model for IT and OT operations. This is an opportunity to join a talented team, help strengthen monitoring and operational capabilities, and contribute to meaningful enterprise reliability work.

This role serves as a top-tier escalation point and supports advanced first- and second-level troubleshooting across Windows, Linux, networking, enterprise applications, and infrastructure platforms. The engineer is expected to develop strong technical and operational documentation, including SOPs, runbooks, troubleshooting guides, incident reports, post-incident reviews, and operational summaries. This position may also participate in a rotational on-call schedule and may be required to provide after-hours support for major incidents, critical issues, maintenance activities, or operational escalations.

Bachelor’s degree in computer science, information technology, or a related field, or equivalent work experience. (Typically, four years of additional related, progressive work experience would be needed for candidates applying for this position who do not possess a bachelor’s degree. A minimum of two years of additional directly related technical experience is required.)

Must have five or more years of experience.

Strong Experience and Technical Skills

Strong experience leading or supporting high-severity incidents in a production environment.

Strong hands‑on troubleshooting across Windows, Linux, networking, and enterprise infrastructure.

Solid knowledge of TCP/IP, routing, switching, VLANs, DNS, DHCP, VPNs, firewalls, and load balancing.

Experience with enterprise monitoring, alerting, and ticketing platforms.

Experience using logs, dashboards, packet captures, traces, and network / system diagnostic tools.

Experience with Python, scripting, APIs, automation, or coding‑based solutions.

Strong writing and communication skills, with the ability to create clear, accurate, and professional SOPs, runbooks, network diagrams, technical procedures, incident reports, post-incident reviews, and operational summaries.

Ability to work effectively in a 24x7 environment, remain calm during major incidents, and participate in a rotational on‑call schedule as needed.

Preferred Qualifications

Experience in regulated, multi-affiliate, or OT / industrial environments.

CCNA, CCNP, or equivalent networking knowledge.

Experience with observability platforms, metrics, logs, traces, and alert design.

Familiarity with ITIL-based incident, problem, and change management practices.

Experience with cloud, hybrid infrastructure, or configuration / automation tooling.

Experience helping build or mature a command center or enterprise operations function.

Candidates will complete a short technical simulation built on real‑world troubleshooting scenarios, which may span networking, Windows, Linux, incident response, automation, and operational decision‑making.

Incident Management, Escalation & Service Restoration

Lead major and critical incidents end-to-end, including restoration strategy, technical coordination, stakeholder communications, and escalation management.
Act as the senior escalation point for network outages, infrastructure failures, and service‑impacting incidents, driving timely restoration with minimal supervision.
Manage incident bridges with clear communication, accurate timelines, and disciplined coordination across infrastructure, application, security, platform, and vendor teams.
Ensure post-incident reviews are complete, actionable, and tracked through closure with clear owners, due dates, and validation steps.

Advanced Technical Troubleshooting &…