Site Reliability Engineering Manager
Job in Riyadh, Riyadh Region, Saudi Arabia
Listed on 2026-05-17
Listing for: Lucid Motors Middle East
Full Time position
Job specializations:
- IT/Tech: SRE/Site Reliability, Cloud Computing
Description
The Cloud team at Lucid Motors is currently seeking a Senior Site Reliability Engineering (SRE) Manager to lead the reliability, scalability, and operational excellence of Lucid Motors’ cloud infrastructure and production services. This role combines hands‑on technical leadership with people management, ensuring systems are highly available while developing and empowering a team of SRE engineers.
Responsibilities
- SRE Leadership & Reliability Ownership
- Own the availability, performance, and reliability of cloud services deployed and operated in KSA.
- Define, implement, and track SRE best practices, including SLIs, SLOs, SLAs, and error budgets.
- Lead the architecture and governance of highly available and disaster‑resilient systems, ensuring disaster‑recovery strategies are tested and maintained.
- Drive capacity planning, auto‑scaling, and performance tuning across Kubernetes‑based platforms.
- Own monitoring, observability, and alerting using Prometheus, Grafana, and logging platforms.
- Lead incident response, impact assessment, and root‑cause analysis for complex production issues.
- Team Management, Mentorship & Growth
- Manage a team of SRE engineers, providing technical direction, career coaching, and performance feedback.
- Review and approve infrastructure code, deployment configurations, automation scripts, and SRE tooling.
- Foster a culture of ownership, learning, blameless postmortems, and continuous improvement.
- Lead hiring, onboarding, and skill development initiatives for the SRE function.
- Ensure fair, sustainable, and well‑documented on‑call rotations.
- Cloud Platforms & Automation
- Oversee production environments on Oracle Cloud Infrastructure (OCI) and AWS.
- Govern Infrastructure‑as‑Code practices using Terraform and configuration management tools.
- Lead CI/CD strategy and implementation using ArgoCD, Jenkins, Maven, Docker, and GitLab.
- Ensure secure and reliable deployment of microservices and data pipelines on Kubernetes using Helm.
- Platform Services & Data Systems
- Collaborate closely with Product Owners, Engineering Managers, Security, and Architecture teams.
- Oversee the reliability and scaling of platform services such as Kafka, Spark, Trino, Airflow, MQTT, and microservices ecosystems.
- Ensure stable operations of NoSQL and RDBMS systems including Elasticsearch, MongoDB, PostgreSQL, and MySQL.
- Support distributed data processing and messaging systems, addressing performance and scalability challenges.
Qualifications
- B.S. or M.S. degree in Computer Science, Engineering, or a related field.
- 8+ years of experience in Site Reliability Engineering, DevOps, or Platform Engineering.
- 2–4 years of experience managing or leading SRE/DevOps engineers.
- Strong hands‑on experience with OCI and AWS cloud platforms.
- Solid expertise in Kubernetes, Terraform, CI/CD pipelines, and cloud‑native architectures.
- Proficiency in Python, Go, Bash/Shell, or similar languages.
- Strong experience with incident management, observability, and performance optimization.
- Fluent in English, with experience collaborating across regions and time zones.
- Experience scaling SRE practices across multiple teams or services.
- Familiarity with compliance, security, and regulated cloud environments.
Lucid offers a wide range of competitive benefits, including medical, dental, vision, life insurance, disability insurance, vacation, and 401(k). The successful candidate may also be eligible to participate in Lucid’s equity program and/or a discretionary annual incentive program, subject to the rules governing such programs.