DevOps Engineer
Job in Riyadh, Riyadh Region, Saudi Arabia
Listed on 2026-01-01
Listing for: Tarjama&
Full Time position
Job specializations:
- IT/Tech: Systems Engineer, Cloud Computing
Job Description & How to Apply Below
The DevOps Engineer will play a mission-critical role owning the deployment, scalability, security, and reliability of AI systems and digital platforms. The role has a strong focus on LLM deployments, AI workloads, and cloud-native infrastructure, ensuring that all AI and software systems operate with enterprise-grade availability, performance, and compliance.
Key Responsibilities:
- Design, build, and maintain CI/CD pipelines for AI models, LLM services, and software applications
- Automate build, test, deployment, and environment configuration workflows to enable rapid and reliable releases
- Deploy, operate, and scale AI systems, LLM APIs, inference workloads, and cloud-based AI services
- Ensure high availability, horizontal scalability, and low-latency inference across all production environments
- Monitor infrastructure performance, system health, and AI workloads using observability and monitoring tools
- Optimize infrastructure for reliability, performance, and cloud cost efficiency
- Implement and enforce security best practices, access controls, secrets management, and environment isolation
- Ensure infrastructure and deployment processes align with national data governance, compliance, and cybersecurity standards
- Collaborate closely with AI Engineers, Full-Stack Engineers, and Product teams to enable seamless, scalable deployments
- Act as the primary technical owner for production reliability during mission-critical deployments
- Maintain comprehensive documentation for DevOps workflows, system architecture, environments, and deployment standards
- Ensure operational readiness, auditability, and knowledge transfer across teams
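As a minimal illustrative sketch of the observability work described above, assuming a hypothetical internal LLM inference endpoint and a Prometheus-based monitoring stack, a synthetic availability and latency probe in Python might look like this; the endpoint URL, port, and payload are placeholders, not details from the listing:

# Illustrative sketch only: a synthetic probe for a hypothetical LLM inference
# endpoint that records availability and latency and exposes them as Prometheus
# metrics. The URL, port, and request payload are placeholders.
import time

import requests
from prometheus_client import Gauge, Histogram, start_http_server

ENDPOINT = "http://llm-inference.internal:8080/v1/completions"  # hypothetical
PROBE_LATENCY = Histogram("llm_probe_latency_seconds", "Latency of synthetic probe requests")
PROBE_UP = Gauge("llm_probe_up", "1 if the last probe succeeded, 0 otherwise")

def probe_once() -> None:
    # Send one synthetic request and record latency and availability.
    start = time.monotonic()
    try:
        resp = requests.post(ENDPOINT, json={"prompt": "ping", "max_tokens": 1}, timeout=10)
        PROBE_LATENCY.observe(time.monotonic() - start)
        PROBE_UP.set(1 if resp.ok else 0)
    except requests.RequestException:
        PROBE_UP.set(0)

if __name__ == "__main__":
    start_http_server(9100)  # expose /metrics for Prometheus to scrape
    while True:
        probe_once()
        time.sleep(30)

In practice a probe like this would sit alongside dashboards and alerting rules rather than replace them.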
Requirements:
- Minimum 5 years of hands-on DevOps engineering experience in production environments
- Mandatory: Proven experience deploying and operating AI systems and LLM-based workloads in production
- Strong hands-on expertise with Docker, Kubernetes, CI/CD platforms, and cloud services
- Experience with monitoring, observability, logging, and infrastructure-as-code (e.g., Terraform or similar tools)
- Strong understanding of networking, security, and cloud-native architecture principles
- Excellent troubleshooting and incident response capabilities in high-availability systems
- Experience with MLOps platforms such as MLflow, SageMaker, Vertex AI, or similar
- Proven experience scaling AI and LLM applications in high-traffic production environments
- Exposure to AI model lifecycle management, retraining pipelines, and operational governance
- Experience in government, regulated, or national-scale enterprise environments
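As a minimal sketch of the model-lifecycle tooling named above, assuming an MLflow tracking server at a hypothetical internal address, a retraining pipeline step could record governance-relevant metadata like this; the URI, experiment name, parameters, tags, and metric values are all placeholders:

# Illustrative sketch only: logging model-lifecycle metadata from a retraining
# pipeline step to an MLflow tracking server. The tracking URI, experiment
# name, parameters, tags, and metric values are hypothetical placeholders.
import mlflow

mlflow.set_tracking_uri("http://mlflow.internal:5000")  # hypothetical server
mlflow.set_experiment("llm-retraining")                 # hypothetical experiment

with mlflow.start_run(run_name="nightly-retrain"):
    # Record what was retrained and against which data snapshot.
    mlflow.log_param("base_model", "example-llm-7b")
    mlflow.log_param("dataset_version", "2026-01-01")
    # Evaluation results captured after retraining (placeholder values).
    mlflow.log_metric("eval_loss", 0.42)
    mlflow.log_metric("p95_latency_ms", 180.0)
    # Tag the run so audit and governance tooling can locate it later.
    mlflow.set_tag("deployment_stage", "staging")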
Key Performance Indicators:
- Uptime, reliability, and stability of AI platforms and production systems
- Deployment speed, automation maturity, and release reliability
- Infrastructure performance, scalability, and cost optimization efficiency
- Security posture and compliance readiness across all environments
- Quality, completeness, and audit readiness of DevOps documentation and workflows