DevOps & Site Reliability Engineer Job, Houston area, Texas, USA, IT/Tech

Position Title:

DEVOPS & SRE ENGINEER

Location:

HOUSTON, TX

FLSA Class: EXEMPT

Responsible to:
Director of Software Engineering

Position Summary

We are seeking a DevOps / Site Reliability Engineer to implement and evolve the infrastructure, deployment pipelines, and reliability posture of our systems. You'll work closely with engineering teams to build scalable, observable, and resilient infrastructure while driving a culture of operational excellence.

Essential Duties and Responsibilities

Design, build, and maintain cloud infrastructure
Manage and optimize Kubernetes clusters and containerized workloads in production
Develop and maintain infrastructure-as-code using Terraform (or equivalent tooling)
Build and improve CI/CD pipelines to enable fast, safe, and reliable deployments
Implement and maintain monitoring, alerting, and observability systems (Prometheus, Grafana, Datadog, or similar)
Define and track SLIs/SLOs, and participate in incident response, root cause analysis, and blameless postmortems
Identify and eliminate toil through automation and self-service tooling
Configure and maintain on-prem bare-metal servers and Linux-based infrastructure
Configure, maintain, and optimize virtualized assets
Collaborate with development teams on system design, capacity planning, and performance optimization
Participate in on-call rotations and ensure production readiness of new services

Other Requirements

4+ years of experience in DevOps, SRE, or infrastructure engineering roles
Strong experience with at least one major cloud provider (AWS, GCP, or Azure; AWS preferred)
Deep hands-on experience with Kubernetes and Docker in production environments
Proficiency with infrastructure-as-code tools, particularly Terraform
Experience building and maintaining CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, or similar)
Solid understanding of monitoring and observability (metrics, logs, traces)
Strong scripting skills (Bash, Python, or Go)
Experience with incident management, SLO-based reliability practices, and capacity planning
Strong Linux systems administration skills (Ubuntu, RHEL/CentOS, or similar)
Experience with virtualization platforms including VM provisioning, storage, networking, and cluster management
Solid understanding of networking, DNS, load balancing, and security fundamentals

Nice To Have

Contributions to internal developer platforms or platform engineering initiatives
Proxmox VE experience
Cloud platform or Kubernetes certifications (AWS SA, CKA, etc.)

Equal Opportunity Employer Statement

The above statements are intended to describe the general nature and level of work being performed by employees assigned to this classification. All personnel may be required to perform duties outside of their normal responsibilities from time to time, as needed.

VoltaGrid is an Equal Opportunity Employer that does not discriminate on the basis of actual or perceived race, creed, color, religion, alienage or national origin, ancestry, citizenship status, age, disability or handicap, sex, marital status, veteran status, sexual orientation, genetic information, arrest record, or any other characteristic protected by applicable federal, state or local laws.

Our management team is dedicated to this policy with respect to recruitment, hiring, placement, promotion, transfer, training, compensation, benefits, employee activities, and general treatment during employment.
