DevOps Engineer, Infrastructure
Listed on 2026-06-16
IT/Tech
Systems Engineer, Cloud Computing: Infrastructure & Operations, SRE/Site Reliability, IT Project Manager
Overview
As a DevOps Engineer, Infrastructure, you will be responsible for deploying product updates, identifying production issues, and implementing integrations that meet our clients' needs.
Starting base pay for this role is between $100,000 and $122,000. The actual base pay is dependent upon many factors, such as transferable skills, work experience, business needs, training, location, and market demands. The base pay range is subject to change and may be modified in the future. This role will be eligible for a bonus as well as competitive medical, dental, and vision benefits, wellness reimbursement, life insurance, and a 401(k) with company match.
We offer vacation and sick leave benefits (under a flexible time off policy in most states).
- Leads DevOps delivery for cloud-native applications, translating architecture and product requirements into infrastructure, CI/CD, and operational runbooks across development through multi-region production environments
- Designs, implements, and maintains AWS infrastructure as code using Terraform across services such as EKS, ECS Fargate, Lambda, API Gateway, WAF, IAM, CloudWatch, OpenSearch, and related platform services
- Owns multi-environment infrastructure promotion workflows, including dependency ordering, shared-resource prerequisites, environment-specific configuration, and post-apply validation
- Builds and enhances CI/CD pipelines in Azure DevOps and GitHub using shared pipeline templates, service-specific pipeline configurations, variable groups, deployment parameters, and multi-environment rollout patterns
- Supports high-velocity delivery for AI initiative projects by helping design and evolve a standardized CI/CD platform that enables Day-1 deployment from initial repo creation through infrastructure provisioning, application deployment, observability, and production readiness
- Partners with Architecture and Engineering teams to establish Day-1 observability for new services, including centralized logging, metrics, tracing, dashboards, alerts, and operational support expectations
- Coordinates cross-repo releases for complex platforms, including Terraform applies, service deployments, third-party tool integrations, release sequencing, and stakeholder communication
- Onboards new services and greenfield projects by analyzing deployment requirements, authoring deployment runbooks and architecture documentation, and creating Terraform workspaces and pipeline scaffolding
- Identifies, diagnoses, and resolves production and pre-production issues using New Relic, Datadog, Kubernetes CLI tools, CloudWatch logs, EKS, ECS, and Lambda diagnostics, OpenSearch, pipeline logs, and infrastructure state analysis
- Communicates incident status, risks, blockers, and resolution plans to engineering teams and stakeholders based on severity and business impact
- Implements and maintains security controls and compliance posture, including IAM least-privilege design, Wiz pipeline integration, supply-chain risk review, and Terraform resource governance
- Defines and improves operational readiness for go-live services, including monitoring gaps, OpenSearch access patterns, VPN and private DNS routing, deployment validation, rollback planning, and incident response playbooks
- Reduces build, deployment, and configuration complexity through automation, standardized repo templates, environment-aware pipeline parameters, shared ECR patterns, central infrastructure standards, and repeatable environment bootstrapping
- Researches and evaluates platform modernization options such as shared databases, Kubernetes, Helm, Argo CD, automated service onboarding, and AI-assisted IaC and CI/CD workflows
- Presents data-driven platform improvement proposals to leadership and partner teams
- Coordinates with DevOps Engineers, Cloud Architects, and Engineering teams on Terraform module design, environment-specific promotion, and safe production change practices
- Manages incident tickets and change requests within defined SLAs, including CAB and release coordination for production promotions
- Creates and maintains technical documentation, including Terraform READMEs, deployment runbooks, architecture…