Discover the Opportunity
We’re partnering with a major government entity in Abu Dhabi that is building large-scale AI infrastructure and next-generation AI platforms to support digital transformation across the public sector.
This role sits within a central AI engineering function responsible for enabling production‑grade AI systems at scale, supporting multiple engineering teams through modern platform, infrastructure, and observability capabilities.
This is a staff‑level engineering opportunity for someone who enjoys building the foundations that power high‑performance AI systems, from GPU inference infrastructure and vector databases through to deployment platforms and developer tooling.
Discover the Responsibilities
- Design, build, and operate scalable AI platform infrastructure, including model serving, vector databases, embedding pipelines, and compute environments.
- Develop and maintain GPU‑based inference infrastructure to support low‑latency, high‑throughput AI workloads in production.
- Build and operate data infrastructure including ingestion pipelines, object storage, vector stores, and ETL processes.
- Design and maintain deployment platforms using containerisation, CI/CD pipelines, and infrastructure‑as‑code practices.
- Implement observability across AI systems, including telemetry, logging, tracing, alerting, and AI‑specific performance monitoring.
- Build reusable internal tooling, deployment patterns, and platform abstractions that improve engineering productivity.
- Define and enforce platform standards across reliability, scalability, security, and operational excellence.
- Partner closely with engineering teams to translate infrastructure requirements into scalable platform capabilities.
- Evaluate and implement new platform technologies to ensure long‑term scalability and operational efficiency.
10+ years of experience in platform engineering, infrastructure engineering, or backend systems engineering.
Strong experience with cloud platforms such as Azure, AWS, or GCP, including compute, networking, storage, and cost optimisation.
Deep expertise in Docker, Kubernetes, and containerised infrastructure supporting AI or high‑scale workloads.
Strong experience building and operating CI/CD pipelines and infrastructure‑as‑code environments.
Experience designing and operating production‑grade data pipelines and distributed systems.
Strong programming skills in Python, Java, Go, or similar backend technologies.
Strong understanding of PostgreSQL and production‑scale database operations.
Hands‑on experience with observability tooling including tracing, logging, metrics, and alerting.
Experience with GPU inference infrastructure, vector databases, RAG pipelines, or AI platform environments is highly desirable.
Strong communication skills with the ability to explain technical decisions, trade-offs, and operational considerations clearly.