AI Platform Engineer
Listed on 2026-06-27
IT/Tech
Systems Engineer, Cloud Computing: Infrastructure & Operations
AI Platform Engineer (MLOps)
Airbus Defence & Space is looking for an AI Platform Engineer (MLOps) to build and scale high-performance computing infrastructure in high-security environments.
The selected candidate will join the Architecture & Integration team at CLAEX, at Torrejón de Ardoz Air Base. Their primary mission will be to design, operate, and evolve the Kubernetes platform dedicated to Artificial Intelligence, ensuring a stable, automated, and scalable working environment built on cutting-edge GPU infrastructure.
We are looking for a specialist in Platform Engineering / MLOps who thrives on the challenges of critical infrastructure. We are not just looking for someone to maintain systems, but for a professional capable of building and automating complex computing environments in maximum-security scenarios (air-gapped or offline environments).
Key Responsibilities
- AI Platform Management: Design and administer Kubernetes environments optimized for Artificial Intelligence workloads.
- Infrastructure Automation: Implement Infrastructure as Code (IaC) methodologies and continuous deployment models (GitOps) to ensure system reproducibility.
- High-Security Environment Operations: Manage air-gapped (isolated) infrastructures, ensuring local repository management, updates, and security without reliance on the public cloud.
- Computing Resource Optimization: Administer and prepare high-performance nodes with GPU acceleration for inference and training tasks.
- Storage and Network Architecture: Configure and maintain persistent storage systems and segmented networks to ensure data integrity and speed.
- Observability and Continuity: Implement advanced monitoring systems to ensure cluster health, GPU performance, and proactive incident detection.
- System Security: Apply hardening policies and access control to protect critical infrastructure.
- Solid experience (3+ years) in DevOps, SRE, or Platform Engineering roles.
- Degree in Computer, Telecommunications, Maths or Software Engineering.
- Proven experience working with container orchestration (Kubernetes).
- Experience managing critical infrastructure or isolated environments (air-gapped/offline).
- Advanced proficiency in Python.
- Advanced proficiency in Linux operating systems and network administration.
- Experience in deployment automation (Ansible, Terraform, or similar tools).
- Ability to work with modern deployment methodologies (GitOps).
- B2 level in English.
- Knowledge of military avionics and embedded/real-time software is desirable, since the LLM training and inference work is aimed at supporting military avionics development and most tasks will relate to military avionics.
- Previous experience managing infrastructure for Artificial Intelligence (NVIDIA/CUDA driver management).
- Knowledge of distributed storage solutions and private image registry management.
- Official Kubernetes certifications (CKA/CKS).
- Interest in working on defense projects and cutting-edge technology within high-security environments.
This job requires an awareness of any potential compliance risks and a commitment to act with integrity, as the foundation for the Company's success, reputation and sustainable growth.
Company:
Airbus Defence and Space SAU
Employment Type:
Permanent
Experience Level: Professional
Job Family:
Software Engineering