Management & Observability Standards Lead
Job in Fairfield, Solano County, California, 94533, USA
Listed on 2026-06-12
Listing for: Axelon Services Corporation
Full-time position
Job specializations:
- IT/Tech: IT Support, Cybersecurity, Systems Engineer, Security Manager
Job Description & How to Apply Below
Pay Rate: $65/hour
Duration: 6 Months
Location: Fairfield, CA
Responsibilities:
- Establish and maintain a department-wide alert rationalization framework.
- Lead continuous improvement efforts to reduce alert fatigue while preserving the detection of true incidents.
- Define and enforce alerting standards including severity definitions, required metadata, naming conventions, and routing rules.
- Create a standardized Alert Design Checklist and approval workflow.
- Act as a gatekeeper for determining alert routing to 24x7 Eyes-on-Glass, on-call engineering, or business-hours handling.
- Establish a consistent approach to cataloging response instructions for actionable alerts.
- Define and publish KPIs demonstrating alerting health and operational performance.
- Facilitate governance forums with service owners and engineering leads to review alert quality and backlog.
- Coach service teams on best practices and drive adoption of observability patterns.
Qualifications:
- Minimum 5 years of experience in IT Operations, SRE, Observability, Monitoring Engineering, or Incident Management.
- Demonstrated success in reducing noise and improving actionability across enterprise alerting ecosystems.
- Experience with common monitoring/observability tools such as Splunk, AppDynamics, Dynatrace, Datadog, Prometheus/Grafana, Azure Monitor, CloudWatch, or ServiceNow Event Management.
- Strong understanding of incident response workflows, operational coverage models, CMDB/service ownership concepts, and knowledge management.
- Excellent stakeholder management skills and ability to drive standards across teams.
- Experience designing or operating an Operations Command Center / NOC / SOC-style “eyes-on-glass” model.
- Familiarity with ITIL Event Management, SRE principles, and service reliability practices.
- Experience with automation for alert enrichment, correlation, and routing.
- Background in governance frameworks and operating rhythm design.