NOC Engineer/NOC Analyst Job, Redmond area, Washington, USA, IT/Tech

Position: NOC Engineer / NOC Analyst

Location

Redmond, WA (local, onsite); 24x7 rotational shifts (including weekends and on-call support), Mon-Sun, 5 a.m.-5 p.m. PT

Shift Requirement

24x7 rotational shifts (including weekends and on-call support)

Role/Summary

Responsible for 24x7 monitoring, incident management, and operational support of a large-scale hybrid infrastructure including servers, virtualization platforms, storage systems, network devices, and applications. Ensure high availability, performance, and reliability across all environments (Prod, DR, Non-Prod).

Must Have Skills

Technical Skills:

Strong knowledge of:
Windows & Linux server administration (basic troubleshooting, L1 and L1.5)
Storage systems: SAN/NAS, Isilon, Quantum or similar PB-scale storage
Networking fundamentals: TCP/IP, DNS, VPN, Firewalls, Load Balancers (F5) (L1 and L1.5)
Experience with monitoring tools (New Relic, Splunk, Nagios, Zabbix, Dynatrace, SCOM, etc.)
Understanding of ITSM tools (ServiceNow preferred) for incident, change, and problem management
Knowledge of the Rubrik backup management tool

Operational Skills:

Incident management and escalation handling in 24x7 environments
Strong troubleshooting and analytical skills
Ability to correlate infrastructure, network, and application issues
Strong communication and coordination skills
Ability to work under pressure in critical outage scenarios
Good documentation and reporting skills

Preferred Qualifications

Experience in large-scale enterprise or MSP environments
Exposure to cloud or hybrid environments (AWS/Azure) is a plus.

Key Responsibilities

Infrastructure Monitoring & Operations

Monitor 1,200+ servers (Windows/Linux), virtualization platforms (VMware, Nutanix), and web servers for performance and availability.
Oversee storage systems (PB-scale: Quantum, Isilon, NAS, SAN), ensuring uptime and capacity health
Monitor network infrastructure (1,200+ devices), including switches, routers, firewalls, VPN tunnels, WAPs, and ISP circuits.
Monitor and act on incidents and requests related to the infrastructure and tools hosted in the environment.
Perform L1/L2 triage for alerts, incidents, and outages across infrastructure and applications
Ensure timely incident resolution, escalation, and communication as per SLAs
Correlate alerts across tools to identify root causes and reduce noise

Application & Service Monitoring

Track service health, availability, and dependencies (web, middleware, backend systems)

Capacity & Performance Management

Track utilization trends across compute, storage (multi-PB), and network
Proactively identify bottlenecks and recommend optimizations

Change & Release Support

Support infrastructure and application deployments, patches, and maintenance activities

Disaster Recovery & Resilience

Support DR readiness for large-scale storage and application environments
Participate in DR drills and failover validation

Reporting & Documentation

Maintain operational dashboards, runbooks, and incident reports
Provide daily/weekly health and SLA reports

