Technical Operations Center Engineer Job, Austin area, Texas, USA, IT/Tech

We are seeking a highly motivated Technical Operations Center (TOC) Engineer to join our 24/7 Technical Operations Center team. This role is a vital part of our live service operations: you will serve as a primary escalation point for the Junior TOC team and help maintain the high availability, performance, and reliability of our global game infrastructure.

The ideal candidate is a composed professional with deep technical expertise in incident, problem, and service request management. While you possess the interpersonal skills to lead during a crisis, your focus will be on the technical architecture, comprehensive documentation, and advanced automation of our operational workflows. You must be a strong troubleshooter with a significant bias toward automation to ensure our studios and global community enjoy an uninterrupted experience.

What You Will Do

Technical Escalation:
Act as a key escalation point for both expected and unexpected events involving production applications, systems, and cloud infrastructure.
Infrastructure Maintenance:
Maintain our global infrastructure across both on-premise and public cloud platforms.
Automation Leadership:
Propose and implement complex automation strategies to improve overall efficiency and system performance.
SRE Collaboration:
Work closely with Site Reliability Engineering teams to onboard and implement innovative technical solutions.
Performance Standards:
Ensure uptime and performance standards are met for a seamless gaming experience.
Advanced Troubleshooting:
Diagnose and resolve high-level technical issues involving production networks and system operations.
Documentation & Audit:
Ensure all operational activities and technical configurations are thoroughly documented and remain compliant with audit requirements.
Service Fulfillment:
Oversee the resolution of technical service requests and user‑submitted tickets, ensuring a high level of customer service.

Who You Are

Experienced:
You have 3+ years in large-scale production networks or systems operations, with a strong grasp of reliability engineering principles.
Collaborative Communicator:
You are able to communicate complex concepts clearly in English, whether with technical staff or senior management.
Adaptable and Agile:
You can support a globally distributed team, quickly adapt to new tools, and embrace changes in a dynamic environment.
Continuous Learner:
You are deeply interested in learning and implementing new technologies and architectural concepts.
Customer Centric:
You always put the needs of the customer first and think about problems and requests through the lens of the end user.

Technical Stack & Preferred Skills

Public Cloud Providers:
Expert‑level knowledge of AWS (GCP and Azure a plus).
Operating Systems:
Advanced administration of Linux and Windows in production environments.
Virtualized Environments:
Deep experience with VMware required. Other virtualization platforms (Proxmox, KVM, Hyper‑V, and WSL) are a plus.
Infrastructure as Code (IaC):
Deep familiarity with Terraform, Ansible, Puppet, and Pulumi.
Networking:
Expertise in protocols, firewall permissions, and advanced network triage.
Passion for Full Stack:
You are passionate about learning the full stack, including pursuing formal training and certifications.
