NOC Engineer
Listed on 2026-06-05
IT/Tech
IT Support, Systems Engineer
Position Title
NOC Engineer
Location
Des Moines, IA, United States
Description
The NOC Engineer is a senior operational engineering role responsible for improving the availability, stability, and reliability of enterprise IT and OT systems across a multi‑affiliate, regulated environment. This role leads complex incident response, resolves cross‑domain production issues, and reduces repeat incidents through advanced troubleshooting, observability, automation, and disciplined operational execution.
The Network Operations Center plays a critical role in enterprise operations and supports the continued evolution of a broader command center model for IT and OT operations. This is an opportunity to join a talented team, help strengthen monitoring and operational capabilities, and contribute to meaningful enterprise reliability work. If you are a hands‑on engineer who enjoys solving difficult technical problems, improving operations, and helping build something stronger, we encourage you to apply.
This role serves as a top‑tier escalation point, supports advanced first‑ and second‑level troubleshooting across Windows, Linux, networking, enterprise applications, and infrastructure platforms, and is expected to develop strong technical and operational documentation, including SOPs, runbooks, troubleshooting guides, incident reports, post‑incident reviews, and operational summaries. This position may also participate in a rotational on‑call schedule and may be required to provide after‑hours support for major incidents, critical issues, maintenance activities, or operational escalations.
Responsibilities
- Lead major and critical incidents end‑to‑end, including restoration strategy, technical coordination, stakeholder communications, and escalation management.
- Act as the senior escalation point for network outages, infrastructure failures, and service‑impacting incidents, driving timely restoration with minimal supervision.
- Manage incident bridges with clear communication, accurate timelines, and disciplined coordination across infrastructure, application, security, platform, and vendor teams.
- Ensure post‑incident reviews are complete, actionable, and tracked through closure with clear owners, due dates, and validation steps.
- Troubleshoot and restore complex production issues across Layer 2 / Layer 3 networking, servers, applications, identity services, virtualization, infrastructure platforms, and OT‑related systems.
- Perform advanced hands‑on troubleshooting across routers, switches, firewalls, Windows servers, Linux systems, VPNs, load balancers, and critical infrastructure dependencies, including work with Cisco and Juniper network products and their command‑line interfaces (CLI).
- Apply strong working knowledge of TCP/IP, routing, switching, VLANs, DNS, DHCP, VPN technologies, firewalls, and enterprise network protocols to isolate failure domains and restore service quickly and accurately.
- Use logs, metrics, dashboards, packet captures, traces, and vendor / platform command‑line tools to diagnose issues, identify root cause, restore service, and partner with engineering teams or vendors on permanent fixes.
- Work with enterprise monitoring and event management platforms to improve alert quality, service visibility, and operational awareness.
- Proactively monitor network and infrastructure health, investigate performance issues, and identify trends that may affect availability, latency, or service quality.
- Design and improve automation using APIs, scripting, coding, and operational tooling to reduce manual effort, improve consistency, and strengthen command center capabilities.
- Review new and changed services for operational readiness, including monitoring, alerting, dependencies, runbooks, support models, and escalation paths.
- Support high‑risk changes, maintenance windows, and cutovers by validating outcomes, detecting regressions, and coordinating rollback when needed.
- Develop and maintain SOPs, procedures, runbooks, network diagrams, troubleshooting guides, incident…