Monitoring Engineering Production Services Specialist II
Job in
Chandler, Maricopa County, Arizona, 85249, USA
Listed on 2026-06-26
Listing for:
The Association of Technology, Management and Applied Engineering
Full Time position
Job specializations:
- IT/Tech: IT Support, SRE/Site Reliability, Cybersecurity
Job Description
This job is responsible for providing support to end users and responding to issues related to incidents and problem management for multiple applications, focusing on leading triage activities on all business-impacting incidents. Key responsibilities include ensuring compliance with incident management and problem management policies and procedures. Job expectations include serving as a key focal point for the customer, client, and associate experience and resolving any impacts to those experiences regardless of where the root cause of the impact lies.
Responsibilities
- Leads production support triage efforts, manages bridge line troubleshooting, engages in technical research, and escalates issues to leadership as needed
- Ensures all impacts are accurately recorded and documented in the system of record, verifies documents and wikis are updated and available for use during triage, and supports on-call responsibilities for incidents, the documentation of application flows, impacts during outages, the customer experience, and contacts for support needs
- Provides status updates and technical detail for awareness communications, such as infrastructure, application and client impact, and component points of failure, oversees accuracy of all communications sent, and ensures any necessary reconvenes are scheduled
- Identifies business impact, interprets monitors, dashboards, and logs, and writes queries to accurately calculate and communicate impacts to leadership in partnership with senior team members or specialists within Technology Services
- Promotes and enforces production governance during triage/testing, identifies production failure scenarios, vulnerabilities, and opportunities for improvement, determines appropriate actions, and escalates issues as needed
- Analyzes, manages, and coordinates incident management activities to detect problems that potentially affect the service level
- Fulfills research requests, ad hoc reports, and offline incidents at the direction of senior team members or the Technology/Production Services teams
Qualifications
- Hands‑on experience with Splunk (search, SPL, dashboards, alerts, data onboarding, and tuning).
- Hands‑on experience with Dynatrace (APM, services/entities, alerting profiles, management zones, dashboards).
- Strong understanding of monitoring and observability concepts: logs, metrics, traces, events, and correlation.
- Experience supporting production systems and participating in incident management and operational support.
- Knowledge of SRE concepts such as reliability engineering, alert hygiene, post‑incident reviews, and automation.
- Experience working with ITSM processes (incident, problem, change) and tracking SI actions to closure.
- Basic to intermediate scripting experience (e.g., Python, Shell) for automation and analysis.
- Strong communication skills and ability to work across distributed teams in the APAC region.
- Experience with advanced Splunk or Dynatrace features (custom metrics, anomaly detection, DQL/SPL optimization, synthetic monitoring).
- Experience integrating monitoring tools with ServiceNow or similar ITSM platforms.
- Familiarity with capacity monitoring, performance engineering, or business transaction monitoring.
- Relevant certifications (Splunk, Dynatrace, SRE/DevOps, Cloud) are a plus.
Skills
- Adaptability
- Analytical Thinking
- Influence
- Production Support
- Risk Management
- Automation
- Collaboration
- Innovative Thinking
- Result Orientation
- Solution Design
- Business Acumen
- DevOps Practices
- Project Management
- Solution Delivery Process
- Stakeholder Management
1st shift (United States of America)
Hours per Week: 40