Senior Site Reliability Engineer - Observability & Monitoring
Listed on 2026-06-03
IT/Tech
Systems Engineer, IT Support
Overview
Dice is the leading career destination for tech experts at every stage of their careers. Our client, Apex Systems, is seeking the following. Apply via Dice today!
Job#: 3036349
Job Description:
Senior Site Reliability Engineer - Observability & Monitoring
Location: Plano, Texas (Onsite)
Employment Type: 12-Month Contract
Role Overview
We are seeking an experienced Observability and Monitoring Site Reliability Engineer to help design, implement, and operationalize monitoring for an enterprise Event Management platform. This role will focus on defining observability coverage, implementing monitoring instrumentation, building operational dashboards, and improving visibility across platform components, integrations, and services. The primary tools for this role are Dynatrace and Splunk.
Key Responsibilities
- Define and implement monitoring and observability coverage for the Event Management platform.
- Establish standards for metrics, logs, traces, events, synthetic checks, and platform telemetry.
- Build monitoring for IBM Cloud Pak for Watson AIOps, Netcool OMNIbus, Netcool Impact, OpenShift, Linux, Kafka-based services, and ServiceNow integration points.
- Design and maintain Dynatrace monitoring for applications, infrastructure, synthetic checks, and platform dependencies.
- Design and maintain Splunk searches, dashboards, alerts, log onboarding patterns, and operational views.
- Create OpenShift and Kubernetes monitoring using available platform metrics, Prometheus, and Grafana.
- Monitor Linux-based platform components, including processes, services, file systems, and resource utilization.
- Monitor Kafka-based integrations, including topic health, consumer lag, and message throughput.
- Provide end-to-end visibility for event flow from platform ingestion through downstream integration.
- Develop runbooks, troubleshooting guides, validation procedures, and operational documentation.
Technical Skills:
- Hands-on experience with Dynatrace for infrastructure, application, synthetic, service, and dependency monitoring.
- Hands-on experience with Splunk, including Search Processing Language (SPL), dashboards, alerts, and field extraction.
- Understanding of OpenShift or Kubernetes monitoring concepts.
- Experience monitoring Linux-based services, processes, logs, file systems, and resource utilization.
- Experience defining monitoring coverage for distributed platforms and integration services.
- Experience with REST APIs, JSON, webhooks, and system-to-system integrations.
- Experience with scripting or automation using Python, shell scripting, or PowerShell.
- Ability to troubleshoot issues across application, infrastructure, platform, and integration layers.
- Strong documentation skills for runbooks, monitoring standards, and support procedures.
Preferred Qualifications
- Experience with IBM Cloud Pak for Watson AIOps.
- Experience with IBM Netcool OMNIbus, including ObjectServer, probes, and gateways.
- Experience with Netcool Impact, including event enrichment and policy logic.
- Experience with Prometheus and Grafana.
- Experience monitoring Kafka, including consumer lag, topic health, and broker health.
- Experience with ServiceNow event, incident, or integration workflows.
- Experience monitoring .NET applications and services.
- Experience with distributed tracing and OpenTelemetry.
- Experience with Git, CI/CD pipelines, and monitoring-as-code or configuration-as-code.
- Familiarity with production change management and regulated enterprise environments.
Everforth Apex is a world-class IT services company that serves thousands of clients across the globe. When you join Everforth Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated’s Best of Staffing in Talent Satisfaction in the United States and Great Place to Work in the United Kingdom and Mexico.
Application Process & Benefits
Everforth Apex uses a virtual recruiter as part of the application process.