Senior Engineer - Alerting & Incident Management
Job in Abu Dhabi, UAE/Dubai
Listed on 2025-12-02
Listing for: First Abu Dhabi Bank (FAB)
Full Time, Seasonal/Temporary position
Job specializations:
- IT/Tech: IT Support, Systems Engineer, Cybersecurity, Cloud Computing
Job Description & How to Apply Below
Senior Engineer - Alerting & Incident Management
Join to apply for the Senior Engineer - Alerting & Incident Management role at First Abu Dhabi Bank (FAB).
Join the UAE’s largest bank and one of the world’s largest and safest financial institutions. Our focus is to create value for our employees, customers, shareholders and communities to grow through differentiation, agility and innovation. We are looking for top talent and your success is our success. Accelerate your growth as you help us reach our goals and advance your career.
Be ready to make your mark at a top company, in an exciting and dynamic industry.
- To establish and maintain an effective, intelligent, and timely alerting framework across infrastructure, application, and business services.
- To coordinate and continuously improve the incident management lifecycle with a focus on early detection, rapid response, and root cause accountability.
- To integrate observability data (logs, metrics, traces) into a unified alerting and incident response workflow.
- To reduce Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR) through automation, clear escalation paths, and operational discipline.
- Manage and continuously improve the incident response process, including triage, escalation, status communications, and resolution tracking.
- Act as the incident commander during major outages or high-severity issues, coordinating technical teams toward resolution.
- Maintain and govern on-call schedules, escalation paths, and responder playbooks.
- Integrate observability tools with incident management platforms to enable real-time, contextual alerting.
- Lead and document root cause analysis (RCA) and ensure completion of follow-up actions and preventive measures.
- Report on incident metrics and trends, identifying areas for resilience and process improvement.
- Maintain detailed documentation on alert rules, incident workflows, contact rosters, and escalation trees.
- Ensure compliance with regulatory, audit, and risk management requirements related to incident response and system availability.
- Collaborate with monitoring, logging, and APM peers to align telemetry signals with operational response.
- Work with development, infrastructure, and support teams to embed alert and incident management best practices in SDLC and change management.
- Participate in regular incident simulations and on-call readiness drills.
- Drive continuous improvement through retrospective reviews, blameless post-mortems, and incident automation.
Core competencies required:
- Strong experience with alert management platforms such as Opsgenie, Splunk On-Call, ServiceNow Event Management, or VictorOps.
- Familiarity with routing rules, escalation policies, noise suppression, on-call schedules, and alert deduplication.
- Deep understanding of the end-to-end incident management process—detection, triage, escalation, communication, and closure.
- Proficient in running major incident bridges, documenting timelines, and leading post-incident reviews (PIRs/RCAs).
- Calm and assertive in high-pressure incident scenarios.
- Excellent communicator—able to coordinate with technical and business stakeholders during incidents.
- Seniority level: Not Applicable
- Employment type: Full-time
- Job function: Engineering and Information Technology
- Industries: Banking
Position Requirements
10+ years of work experience