Lead Technical Operations Center Engineer, Los Angeles; On-Site Job Los Angeles area, California USA, IT/Tech

Position: Lead Technical Operations Center Engineer, Los Angeles (On-Site)

About Us

Data Analysis Incorporated (DAI) is the controlling entity of the O’Neil family of businesses. DAI and its subsidiaries operate in diverse industries worldwide, including global equity markets, health care, financial services, digital news, and insurance. Our global footprint allows our teams to be responsive to customer needs in a timely and efficient manner. We are dedicated to using technology and innovation to bring change and growth to our businesses.

We believe in a dynamic workplace, creating engaging, informative products and services that help our customers succeed. Integrity is an essential characteristic for our firms and our associates; if this describes you, please apply!

Summary

The Lead Technical Operations Center (TOC) Engineer is a senior-level operational leader responsible for overseeing enterprise infrastructure monitoring, incident response, and system/network reliability. This role provides both technical depth and people leadership, guiding a team of engineers in a high-availability, mission-critical environment.

The Lead serves as the primary escalation point for major incidents, drives continuous improvement in observability and monitoring strategy, and ensures operational excellence across infrastructure platforms. This position requires prior leadership or management experience and a strong background working with enterprise monitoring and service management tools.

Compensation and Location

$100K - $105K base pay + 15% yearly bonus target
12655 Beatrice Street, Los Angeles, CA 90066
Schedule - Monday to Friday (Flexible hours of either 8AM to 5PM or 10AM to 7PM)

Duties and Responsibilities

Lead daily TOC operations, including real-time monitoring, alert triage, escalation management, and incident resolution workflows to ensure optimal system uptime.
Serve as the senior escalation point for high-priority and enterprise-impacting incidents; coordinate response efforts, drive root cause analysis (RCA), and ensure timely stakeholder communications.
Provide direct leadership, mentorship, and performance guidance to a team of engineers within a NOC/TOC environment.
Collaborate with Systems, Network, Cloud, and Security Engineering teams to enhance observability, alert quality, reliability engineering practices, and automation capabilities.
Oversee and optimize monitoring and alerting platforms, including the implementation of detection logic, dashboards, runbooks, and automation scripts.
Utilize and support IT Service Management (ITSM) processes within Jira Service Management, including incident, problem, and change management workflows.
Participate in ITIL-aligned processes such as Major Incident Reviews (MIRs), Change Advisory Board (CAB) meetings, SLA management, and process documentation.
Contribute to departmental planning, reporting, KPI tracking, and operational maturity initiatives. Act on behalf of the TOC Manager when required.

Qualifications & Requirements

Required Education & Experience

Bachelor’s degree in Computer Science, Information Systems, or related field (or equivalent professional experience).
7+ years of experience in IT Operations, Infrastructure Engineering, Network Administration, or Systems Engineering.
2+ years of leadership or management experience in a NOC, TOC, or similar operational environment (team lead, supervisor, or manager capacity required).
Demonstrated experience leading incident response efforts in high-availability production environments.

Technical Skills & Knowledge

Monitoring & Observability Platforms

Hands-on working experience with enterprise monitoring and observability tools, including:
Jira Service Management (incident, problem, and change workflows)
Datadog
AWS CloudWatch
Orion / SolarWinds
Splunk
Prometheus
Experience with these tools does not need to have been in a prior lead role; however, the candidate must have practical, hands-on experience working with them in an operational capacity.

Infrastructure & Systems

Strong networking fundamentals (TCP/IP, DNS, VPNs, VLANs, firewalls, routing protocols such as BGP and OSPF).
Experience supporting Windows and Linux server environments.
Cloud platform experience (AWS and/or Azure).
Familiarity with cloud-native monitoring and logging frameworks.
Scripting and automation proficiency (PowerShell, Python, Ansible, or similar tools).
Knowledge of ITIL frameworks and SLA-driven service delivery models.

Preferred Certifications

ITIL Foundation
CompTIA Network+ or Security+
Microsoft Azure Associate (AZ-104 preferred)
Cisco CCNA or equivalent
AWS Cloud Practitioner (or higher)

Leadership & Professional Competencies

Proven ability to lead and mentor technical teams in high-pressure operational environments.
Strong analytical and problem-solving capabilities, particularly in incident management and systems troubleshooting.
Excellent verbal and written communication skills, including executive-level incident reporting.
Ability to influence cross-functional teams and drive operational improvements across departments.
Demonstrated…

