Senior DevOps Engineer
Listed on 2026-05-31
IT/Tech
Cloud Computing, Systems Engineer, SRE/Site Reliability, Network Engineer
Job Overview
We are seeking an experienced Senior DevOps Engineer to design, implement and maintain scalable infrastructure and automation systems that support our software development and deployment processes. The ideal candidate will have strong expertise in cloud platforms, CI/CD pipelines, infrastructure as code and system reliability practices. This role requires collaboration with development, security and operations teams to improve system performance, reliability and deployment efficiency.
Experience supporting AI or Voice AI infrastructure is a strong plus.
Responsibilities
- Design, implement and maintain scalable cloud infrastructure
- Automate provisioning and configuration management using Infrastructure as Code (IaC)
- Optimize infrastructure for performance, cost efficiency and scalability
- Architect multi-region, highly available production environments
- Design and operate Kubernetes clusters handling high production traffic
- Lead capacity planning, node pool design and autoscaling strategies
- Define infrastructure standards across staging, pre-production and production
- Drive infrastructure reliability improvements based on production incidents
- Implement workload isolation strategies and resource governance policies
- Build and maintain CI/CD pipelines using tools such as Jenkins, GitLab or GitHub
- Implement automated build, test and deployment processes
- Support continuous integration and continuous delivery best practices
- Design progressive delivery strategies such as blue‑green and canary deployments
- Implement automated rollback and deployment validation mechanisms
- Define GitOps-based deployment workflows where applicable
- Enforce artifact integrity, pipeline security scanning and policy controls
- Optimize pipelines for high deployment frequency and production safety
- Deploy and manage infrastructure on cloud platforms such as Amazon Web Services, Google Cloud or Microsoft Azure
- Manage containerized applications using Docker and orchestration systems like Kubernetes
- Support infrastructure for real‑time applications including voice or conversational AI services
- Operate Kubernetes clusters at production scale with thousands of pods
- Design autoscaling strategies using HPA, VPA and cluster autoscalers
- Implement advanced networking configurations for low‑latency systems
- Optimize infrastructure for real‑time, high‑concurrency workloads
- Lead troubleshooting of distributed systems across multiple services
- Implement monitoring, logging and alerting solutions
- Ensure high availability and system reliability, and maintain incident response processes
- Perform root‑cause analysis and system optimization
- Define and manage SLIs, SLOs and error budgets
- Own production incidents end‑to‑end, including postmortems
- Reduce alert fatigue by designing high‑signal alerting strategies
- Implement distributed tracing and deep observability practices
- Drive reliability initiatives based on root‑cause analysis
- Establish on‑call processes and reliability standards
Requirements
- Minimum of 8 years of experience in DevOps, Site Reliability Engineering or Infrastructure Engineering
- Strong experience with Linux system administration
- Hands‑on experience with Infrastructure as Code tools such as Terraform or CloudFormation
- Experience building and maintaining CI/CD pipelines
- Strong knowledge of containerization and orchestration technologies
- Experience with cloud platforms (AWS, Azure or GCP)
- Proficiency in scripting languages such as Python, Bash or Go
- Experience with monitoring tools such as Prometheus, Grafana or the ELK stack
- Strong problem‑solving and troubleshooting skills
Preferred Qualifications
- Experience deploying or supporting Voice AI or conversational AI infrastructure
- Familiarity with real‑time audio streaming or telephony platforms such as Twilio
- Experience working with AI platforms such as OpenAI or cloud‑based speech services
- Understanding of low‑latency systems used for voice, speech or real‑time communication platforms
We are an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender identity, sexual orientation, national origin, age, disability status, or protected veteran status.