Senior DevOps Engineer
Listed on 2026-01-07
IT/Tech
Systems Engineer, Cloud Computing
Evidation creates new ways to measure and improve health in everyday life, making proactive, personalized, and truly human-centered healthcare possible. By connecting directly with millions of individuals, Evidation harnesses real-world data to deeply understand health experiences, rapidly and … Evidation’s privacy-centric digital health measurement and engagement platform uses data science and machine learning to translate these everyday insights into high-impact health guidance, treatments, and tools.
Founded in 2012, Evidation is headquartered in California with employees working around the globe.
We are seeking a Senior DevOps Engineer with experience building product-oriented cloud infrastructure, application delivery, observability, and operational excellence for Evidation’s platform. This role will design, build, and continuously improve the tooling, automation, and cloud-native systems that power our production workloads. It requires deep expertise in AWS, Kubernetes (EKS), Infrastructure as Code, CI/CD, containerization, and modern DevOps automation patterns. Success in this role means raising the reliability and repeatability of systems, enabling engineering teams to move faster, and driving high-quality operational outcomes across the organization.
Preferred locations:
Santa Barbara, Southern California
- Design, build, and maintain highly available, scalable infrastructure on AWS using Infrastructure as Code.
- Design and operate multi-tenant Kubernetes environments running on EKS, including cluster operations, workload management, autoscaling, and cost-optimized configurations.
- Drive Infrastructure-as-Code (IaC) best practices using Terraform and Pulumi, including modularization, testing, versioning, and safe deployment patterns.
- Contribute to the CI/CD ecosystem using GitHub Actions, reusable workflows, and secure secrets management; ensure fast, resilient, and traceable deployment pipelines.
- Build and maintain a container-based software delivery pipeline leveraging Docker, Helm charts, and GitHub workflows.
- Define and continuously improve monitoring, alerting, dashboards, and logging using Datadog.
- Evaluate operational data to identify performance, stability, and cost-efficiency opportunities.
- Provide advanced support for major incidents, performing root cause analysis, writing clear postmortems, and ensuring long-term corrective actions.
- Apply a security-first mindset to infrastructure architecture, IAM, network boundaries, and workload configurations.
- Implement work in alignment with controls that support ISO 27001, SOC 2, HIPAA, and other regulatory requirements.
- Collaborate with Security to operationalize secure-by-default infrastructure patterns.
- Collaborate with Engineering, Data, and Delivery teams to define requirements, translate technical needs, and deliver scalable solutions.
- Facilitate knowledge sharing through documentation, playbooks, incident reviews, and architectural discussions.
- Identify opportunities to add value beyond immediate requests—improving reliability, simplifying processes, and reducing operational load.
- 8+ years of DevOps, SRE, Platform Engineering, or relevant experience supporting production cloud systems.
- Expert-level experience with AWS services.
- Expert-level experience managing Kubernetes environments, including Helm, KEDA, cluster lifecycle, and multi-environment deployments.
- Advanced CI/CD experience using GitHub Actions (workflows, reusable workflows, OIDC auth, environments) or similar technology.
- Expert-level containerization skills (Docker, image optimization, registry management).
- Strong proficiency with Terraform and Pulumi for Infrastructure as Code.
- Hands-on experience with AI-assisted development tools (VS Code, GitHub Copilot, code generation workflows).
- Strong proficiency with scripting and coding automation tools. Experience in more than one of: Bash, Python, Ruby, or Go.
- Experience building reliable, observable systems using Datadog (metrics, logs, traces, monitors) or a similar solution.
- Strong understanding of distributed systems, networking, autoscaling, and operational patterns in cloud-native architectures.
- Strong…