Site Reliability Engineering SRE Consultant Splunk/Instana/AppDynamics
Job in
Riyadh, Riyadh Region, Saudi Arabia
Listed on 2026-06-20
Listing for:
EJADA
Full Time position
Job specializations:
- IT/Tech
SRE/Site Reliability, Cloud Computing: Infrastructure & Operations, Systems Engineer, IT Support
Job Description & How to Apply Below
The SRE Consultant – Observability & APM is responsible for designing, implementing, and optimizing large-scale observability and application performance monitoring platforms to ensure the reliability, performance, scalability, and availability of mission-critical enterprise systems. The role applies Site Reliability Engineering (SRE) principles across logging, monitoring, APM, and observability domains, acting as a subject matter expert for platforms such as Splunk, Instana, and AppDynamics, while driving automation, performance engineering, and operational excellence across hybrid and cloud-native environments.
Key Accountabilities
- Architect, deploy, and operate enterprise-grade observability and APM platforms, including Splunk, Instana, and/or AppDynamics, across on-premises, cloud, and hybrid environments.
- Apply SRE principles by defining and managing SLIs, SLOs, and error budgets to ensure platform reliability and service performance.
- Lead performance analysis, troubleshooting, and root cause analysis (RCA) for complex application and platform-level issues.
- Design and maintain dashboards, alerts, health rules, and analytics use cases to provide end-to-end system visibility.
- Perform capacity planning, performance tuning, and scalability assessments for observability and APM platforms.
- Drive automation initiatives using scripting and Infrastructure as Code (IaC) to improve reliability, consistency, and operational efficiency.
- Integrate observability platforms with ITSM, CI/CD pipelines, SIEM, and incident management tools.
- Provide technical leadership, guidance, and mentorship to SRE, DevOps, and operations teams.
- Advise engineering and leadership teams on observability best practices and platform strategy.
- Maintain platform documentation, standards, and operational runbooks.
Qualifications
- Bachelor’s degree in Computer Science, Information Technology, or a related field.
- 6+ years of experience in SRE, IT Operations, DevOps, or application performance/observability roles.
- Strong foundation in Site Reliability Engineering (SRE), observability, and modern application architectures.
- Proven hands‑on experience with at least one of the following platforms in large-scale enterprise environments: Splunk, Instana, or AppDynamics.
- Deep hands‑on expertise in observability, logging, and APM platforms (Splunk, Instana, AppDynamics).
- Strong understanding of APM, metrics, logs, traces, and performance engineering concepts.
- Proficiency in SRE practices, including reliability measurement, automation, and incident management.
- Experience with cloud platforms (AWS, Azure, GCP) and container orchestration technologies (Kubernetes / OpenShift).
- Strong automation and scripting skills (e.g., Python, Bash, PowerShell).
- Experience with Infrastructure as Code tools (e.g., Terraform, Ansible, Puppet) is highly desirable.
- Solid knowledge of Linux/Unix and Windows operating systems, networking, and system performance.
- Ability to communicate complex technical concepts clearly to both technical and non-technical stakeholders.
- Strong analytical, troubleshooting, and problem-solving skills.
- Relevant platform or cloud certifications (e.g., Splunk Architect, Instana, AppDynamics, Cloud/SRE certifications) are a plus.