Cloud Operations Engineer

Job in San Jose, Santa Clara County, California, 95199, USA
Listing for: Extreme Networks, Inc.
Full Time position
Listed on 2026-02-16
Job specializations:
  • IT/Tech
    Systems Engineer, Cloud Computing
Salary/Wage Range or Industry Benchmark: 150,000 - 200,000 USD yearly
Job Description & How to Apply Below
Position: Staff Cloud Operations Engineer

Candidates for this role may be located in Portugal, Spain, Poland or Ireland

Job Teaser Summary

Extreme’s Cloud Operations team is a group of talented engineers passionate about building highly reliable, scalable and secure solutions in public/private cloud environments. We are looking to hire a highly motivated Cloud Operations engineer with strong working experience in production operation and deployment automation. You will work with the team to design, develop and implement deployment automation solutions end-to-end. You will also be expected to participate in continuous cloud service operation, troubleshoot and resolve complex issues in production.

We will work together to design, develop and implement the best public / private / local cloud solutions for our customers. Extreme Networks is the right place to be and now is the right time to join us and be part of our spectacular growth and success. We're looking for the best and the brightest "A" players who want to make a difference doing a job they love.

About the Role

We want you to help lead infrastructure engineering for Extreme Cloud, a multi-cloud SaaS platform. Design, build, and operate large-scale, multi-region Kubernetes environments across AWS, GCP, Azure, and on-prem. Drive reliability, scalability, and operational excellence for a platform serving global customers.

What You'll Do
  • Architect & Scale Infrastructure: Design and implement multi-cluster, multi-region Kubernetes deployments using EKS, GKE, and AKS. Build infrastructure that scales across regions and cloud providers.
  • Own Production Systems: Take end-to-end ownership of production infrastructure. Drive incident response, postmortems, and improvements to prevent recurrence.
  • Infrastructure as Code at Scale: Build and maintain Terraform modules for complex infrastructure patterns. Manage thousands of configuration files across clusters, regions, and environments using GitOps principles.
  • GitOps & Deployment Excellence: Design and optimize ArgoCD ApplicationSets and Helm chart architectures. Build deployment pipelines that enable safe, automated releases across hundreds of microservices.
  • Performance & Reliability Engineering: Analyze system performance, identify bottlenecks, and implement optimizations. Improve SLOs through capacity planning, autoscaling, and architectural improvements.
  • Observability & Monitoring: Build and enhance monitoring, alerting, and observability using Prometheus, Grafana, Loki, and custom tooling. Drive visibility into complex distributed systems.
  • Security & Compliance: Implement security controls, compliance frameworks, and best practices across cloud infrastructure. Design secure multi-tenant architectures.
  • Technical Leadership: Mentor engineers, establish best practices, and drive technical decisions. Collaborate with platform, SRE, and product teams to deliver reliable infrastructure.
What We're Looking For
  • 5+ years in cloud infrastructure engineering, with deep expertise in at least one major cloud provider (AWS preferred)
  • Strong Kubernetes experience: cluster design, operators, controllers, and multi-cluster management
  • Proficiency with Infrastructure as Code: Terraform, CloudFormation, or similar
  • GitOps expertise: ArgoCD, Flux, or similar; experience with ApplicationSets and complex deployment patterns
  • Deep Linux and networking knowledge
  • Experience with distributed systems: Elasticsearch, PostgreSQL, Redis, Kafka, RabbitMQ
  • Monitoring and observability: Prometheus, Grafana, ELK stack, or similar
  • Strong problem-solving skills and experience debugging complex distributed systems
  • Experience with cloud security, compliance (SOC2, ISO 27001), and secure-by-design practices
  • Excellent communication skills for working across time zones and with distributed teams
  • Self-directed with a track record of owning problems end-to-end
Nice to Have
  • Experience with multi-cloud architectures and cloud-agnostic patterns
  • Contributions to open-source infrastructure projects
  • Experience with service mesh technologies (Istio, Linkerd)
  • Knowledge of chaos engineering and reliability testing
  • Experience with cost optimization and FinOps practices
Why This Role
  • Work on infrastructure at…