Senior DevOps Engineer; Kubernetes Remote
Cape Town, 7100, South Africa
Listed on 2026-01-30
IT/Tech
Systems Engineer, Cloud Computing, SRE/Site Reliability
🔎 Job Title: Senior DevOps Engineer (Kubernetes) | MSP Staffing | Remote / Cape Town
🏢 Recruiting Company: MSP Staffing
🌍 Job Location: Remote (Hybrid option available in Cape Town, South Africa)
💼 Job Type: Full-Time
💰 Salary: ~R95,000 per month
📧 Application Method:
Send CV to ge with subject line: “Senior DevOps Engineer”
MSP Staffing is urgently seeking a highly experienced Senior DevOps Engineer to take full ownership of Kubernetes platforms across both AWS and on‑premises environments. This role is central to ensuring platform reliability, automation, and operational excellence for mission‑critical workloads. The ideal candidate brings deep, hands‑on Kubernetes operations experience and thrives in environments where stability, scalability, and automation are top priorities.
Detailed Job Description
As a Senior DevOps Engineer, you will be responsible for designing, running, and improving Kubernetes-based platform services across hybrid cloud ecosystems. You will implement advanced automation, manage infrastructure as code, and support production‑grade, stateful workloads. This includes working with EKS clusters, on‑prem Kubernetes, GitOps pipelines, Terraform modules, and high‑availability networking components. You’ll collaborate with SRE, security, development, and platform engineering teams to uphold SLAs, manage incidents, optimize observability, and ensure resilient infrastructure operations.
Key Responsibilities
- Own and operate Kubernetes platforms across AWS EKS and on‑prem environments
- Implement and maintain GitOps workflows (ArgoCD preferred)
- Design and manage infrastructure using advanced Terraform practices
- Support stateful workloads (databases, message queues, storage drivers) on K8s
- Build and improve automation using Ansible and additional toolchains
- Manage Linux‑based systems supporting Kubernetes clusters
- Ensure platform availability, performance, and operational reliability
- Handle incident response, root‑cause analysis, and production troubleshooting
- Implement and maintain observability tools covering logs, metrics, and traces
- Ensure secure networking, ingress, DNS, TLS, and load‑balancing configurations
- Collaborate with development and SRE teams to improve deployment pipelines
Requirements
- Extensive hands‑on experience managing Kubernetes in production environments
- Proven AWS EKS operational expertise
- Advanced Terraform (modular IaC, automation, pipelines)
- GitOps experience (ArgoCD strongly preferred)
- Strong Linux administration background
- Hands‑on experience supporting stateful workloads on Kubernetes
- Ansible experience (automation, configuration management)
- Deep operational experience (monitoring, alerts, incident response, SLAs)
- Expertise with hybrid cloud + on‑prem deployments
- Solid understanding of networking fundamentals: DNS, ingress, SSL/TLS, load balancing
- Experience with service mesh technologies (Istio, Linkerd)
- Knowledge of cloud security, IAM, and secrets management (Vault, KMS, Sealed Secrets)
- CI/CD pipeline experience (GitHub Actions, GitLab CI, Jenkins)
- Familiarity with distributed systems and high‑availability architecture
- Experience with storage backends (Ceph, EBS, EFS, Longhorn, Portworx)
Showcase real examples of Kubernetes platforms you’ve owned end‑to‑end: cluster deployments, IaC automation, major incident resolution, or scaling stateful workloads. Demonstrating complete lifecycle ownership is the strongest signal of senior DevOps capability for this role.