Lead Technical Operations Center Engineer, Los Angeles; On-Site
Listed on 2026-03-12
IT/Tech
IT Support, Systems Engineer, Cloud Computing, IT Project Manager
About Us
Data Analysis Incorporated (DAI) is the controlling entity of the O’Neil family of businesses. DAI and its subsidiaries operate in diverse industries worldwide, including global equity markets, health care, financial services, digital news, and insurance. Our global footprint allows our teams to be responsive to customer needs in a timely and efficient manner. We are dedicated to using technology and innovation to bring change and growth to our businesses.
We believe in a dynamic workplace, creating engaging, informative products and services that help our customers succeed. Integrity is an essential characteristic for our firms and our associates; if this describes you, please apply!
The Technical Operations Center (TOC) Engineer, Lead is a senior-level operational leader responsible for overseeing enterprise infrastructure monitoring, incident response, and system/network reliability. This role provides both technical depth and people leadership, guiding a team of engineers in a high-availability, mission-critical environment.
The Lead serves as the primary escalation point for major incidents, drives continuous improvement in observability and monitoring strategy, and ensures operational excellence across infrastructure platforms. This position requires prior leadership or management experience and a strong background working with enterprise monitoring and service management tools.
Compensation and Location
- $100K - $105K base pay + 15% yearly bonus target
- 12655 Beatrice Street, Los Angeles, CA 90066
- Schedule: Monday to Friday (flexible hours of either 8 AM to 5 PM or 10 AM to 7 PM)
Responsibilities
- Lead daily TOC operations, including real-time monitoring, alert triage, escalation management, and incident resolution workflows to ensure optimal system uptime.
- Serve as the senior escalation point for high-priority and enterprise-impacting incidents; coordinate response efforts, drive root cause analysis (RCA), and ensure timely stakeholder communications.
- Provide direct leadership, mentorship, and performance guidance to a team of engineers within a NOC/TOC environment.
- Collaborate with Systems, Network, Cloud, and Security Engineering teams to enhance observability, alert quality, reliability engineering practices, and automation capabilities.
- Oversee and optimize monitoring and alerting platforms, including the implementation of detection logic, dashboards, runbooks, and automation scripts.
- Utilize and support IT Service Management (ITSM) processes within Jira Service Management, including incident, problem, and change management workflows.
- Participate in ITIL-aligned processes such as Major Incident Reviews (MIRs), Change Advisory Board (CAB) meetings, SLA management, and process documentation.
- Contribute to departmental planning, reporting, KPI tracking, and operational maturity initiatives. Act on behalf of the TOC Manager when required.
Required Education & Experience
- Bachelor’s degree in Computer Science, Information Systems, or related field (or equivalent professional experience).
- 7+ years of experience in IT Operations, Infrastructure Engineering, Network Administration, or Systems Engineering.
- 2+ years of leadership or management experience in a NOC, TOC, or similar operational environment (team lead, supervisor, or manager capacity required).
- Demonstrated experience leading incident response efforts in high-availability production environments.
Technical Skills & Knowledge
Monitoring & Observability Platforms
- Hands-on working experience with enterprise monitoring and observability tools, including:
- Jira Service Management (incident, problem, and change workflows)
- Datadog
- AWS CloudWatch
- Orion / SolarWinds
- Splunk
- Prometheus
- Experience with these tools does not need to have been in a prior lead role; however, the candidate must have practical, hands-on experience working with them in an operational capacity.
- Strong networking fundamentals (TCP/IP, DNS, VPNs, VLANs, firewalls, routing protocols such as BGP and OSPF).
- Experience supporting Windows and Linux server environments.
- Cloud platform experience (AWS and/or Azure).
- Familiarity with cloud-native monitoring and logging frameworks.
- Scripting and automation proficiency (PowerShell, Python, Ansible, or similar tools).
- Knowledge of ITIL frameworks and SLA-driven service delivery models.
Certifications
- ITIL Foundation
- CompTIA Network+ or Security+
- Microsoft Azure Associate (AZ-104 preferred)
- Cisco CCNA or equivalent
- AWS Cloud Practitioner (or higher)
- Proven ability to lead and mentor technical teams in high-pressure operational environments.
- Strong analytical and problem-solving capabilities, particularly in incident management and systems troubleshooting.
- Excellent verbal and written communication skills, including executive-level incident reporting.
- Ability to influence cross-functional teams and drive operational improvements across departments.
- Demonstrated…