Kubernetes Platform Engineer
Listed on 2026-02-14
IT/Tech
Cloud Computing, Systems Engineer, SRE/Site Reliability
This role is critical to the implementation, development, and maintenance of Kubernetes and NATS platforms. These enterprise platforms power compute and data integration across Edge sites (on-premises) and Cloud (AWS). The platforms support use cases spanning Manufacturing, Supply Chain, and Commercial activities, with a focus on delivering high-availability, reliable platforms that enable mission-critical applications, including AI workloads.
Platform Team
The platform team is a 24/7 capability responsible for maintaining and enhancing compute and integration capabilities, especially using Kubernetes and NATS (streaming technologies) across Edge sites and Cloud environments. This team collaborates closely with application teams and stakeholders across the organization.
A Day in The Life Typically Includes
- Design, build, and operate Kubernetes clusters and container platforms at scale supporting multiple environments (dev, staging, production) across Edge and Cloud (AWS)
- Implement and maintain CI/CD pipelines for automated deployment and infrastructure provisioning
- Develop infrastructure as code using tools such as Terraform, Helm, Ansible, or similar technologies
- Monitor platform health, troubleshoot infrastructure issues, and drive continuous improvement
- Collaborate with development teams to containerize applications and optimize resource utilization
- Support critical platforms including consulting, debugging, break/fix execution, and participation in on‑call rotation
- Enable self‑service capabilities for application teams on Kubernetes and NATS platforms
- Proactively seek and share knowledge, build strong networks, and drive continuous improvement through learning, experimentation, and constructive challenges
- Be willing and able to join an on‑call rotation covering nights and weekends to respond to critical outages and incidents
- Strong expertise with Kubernetes, including cluster setup, management, networking, storage, RBAC, and troubleshooting
- Demonstrated experience creating and maintaining infrastructure using Infrastructure‑as‑Code tools (Terraform, Helm, CloudFormation, Ansible)
- Experience managing a GitHub organization with DevOps and Infrastructure‑as‑Code practices in GitHub Actions, Workflows, etc.
- Experience with container orchestration and GitOps tools like ArgoCD and Rancher
- Experience administering cloud platforms (AWS, Azure), with emphasis on enterprise governance, support, automation, and development
- Proficiency in Python or Go for developing automation scripts, tools, and infrastructure management solutions
- Experience with data streaming technologies such as Kafka, NATS, MQTT, or similar platforms
- Experience with monitoring and logging tools such as Prometheus, Grafana, Splunk
- Experience with edge computing architectures and hybrid cloud deployments
- Certifications such as CKA (Certified Kubernetes Administrator), CKAD (Certified Kubernetes Application Developer), or cloud provider certifications (AWS Solutions Architect, etc.)