Technical Operations Center Engineer
Listed on 2026-05-16
IT/Tech
Systems Engineer, Cloud Computing, IT Support, Network Engineer
We are seeking a highly motivated Technical Operations Center (TOC) Engineer to join our 24/7 Technical Operations Center team. This role is a vital part of our live service operations: you will serve as a primary escalation point for the Junior TOC team and play a critical part in maintaining the high availability, performance, and reliability of our global game infrastructure.
The ideal candidate is a composed professional with deep technical expertise in incident, problem, and service request management. While you have the interpersonal skills to lead during a crisis, your focus will be on the technical architecture, comprehensive documentation, and advanced automation of our operational workflows. You must be a strong troubleshooter with a strong bias toward automation, so that our studios and global community enjoy an uninterrupted experience.
What You Will Do
- Technical Escalation: Act as a key escalation point for both expected and unexpected events involving production applications, systems, and cloud infrastructure.
- Infrastructure Maintenance: Maintain our global infrastructure across both on-premise and public cloud platforms.
- Automation Leadership: Propose and implement complex automation strategies to improve overall efficiency and system performance.
- SRE Collaboration: Work closely with Site Reliability Engineering teams to onboard and implement innovative technical solutions.
- Performance Standards: Ensure uptime and performance standards are met for a seamless gaming experience.
- Advanced Troubleshooting: Diagnose and resolve high-level technical issues involving production networks and system operations.
- Documentation & Audit: Ensure all operational activities and technical configurations are thoroughly documented and remain compliant with audits.
- Service Fulfillment: Oversee the resolution of technical service requests and user‑submitted tickets, ensuring a high level of customer service.
- Experienced: You have 3+ years in large-scale production networks or systems operations, with a strong grasp of reliability engineering principles.
- Collaborative Communicator: You are able to communicate complex concepts clearly in English, whether with technical staff or senior management.
- Adaptable and Agile: You can support a globally distributed team, quickly adapt to new tools, and embrace changes in a dynamic environment.
- Continuous Learner: You are deeply interested in learning and implementing new technologies and architectural concepts.
- Customer Centric: You always put the needs of the customer first and think about problems and requests through the lens of the end user.
- Public Cloud Providers: Expert‑level knowledge of AWS (GCP and Azure a plus).
- Operating Systems: Advanced administration of Linux and Windows in production environments.
- Virtualized Environments: Deep experience with VMware required. Other virtualization platforms (Proxmox, KVM, Hyper‑V, and WSL) are a plus.
- Infrastructure as Code (IaC): Deep familiarity with Terraform, Ansible, Puppet, and Pulumi.
- Networking: Expertise in protocols, firewall permissions, and advanced network triage.
- Passion for Full Stack: You are passionate about learning the full stack including pursuing formal training and certifications.