Sr. DevOps Engineer II
Listed on 2026-02-18
IT/Tech
SRE/Site Reliability, Cloud Computing, Systems Engineer
NYC Global HQ | Technology Operations - Platform Team | Hybrid (3 days in office, flexible scheduling)
DoubleVerify is the leading independent provider of marketing measurement software, data and analytics that authenticates the quality and effectiveness of digital media for the world's largest brands and media platforms. DV provides media transparency and accountability to deliver the highest level of impression quality for maximum advertising performance. Since 2008, DV has helped hundreds of Fortune 500 companies gain the most from their media spend by delivering best‑in‑class solutions across the digital ecosystem, helping to build a better industry.
Learn more at
The DevOps Platform team builds and operates the shared infrastructure foundation that powers all of DoubleVerify's engineering teams. We're the team behind the platforms processing billions of events daily: Kubernetes clusters spanning cloud and data centers, streaming and data systems, workflow orchestration, and the observability stack that keeps everything running. As a Senior Platform Engineer, you'll own critical infrastructure that hundreds of developers depend on every day.
This isn't about keeping the lights on—it's about building resilient, self‑service platforms that make complex distributed systems simple to operate. You'll work with cutting‑edge technologies (Kubernetes, Kafka, Aerospike, ArgoCD, Envoy) and have the autonomy to define standards and shape how DV's infrastructure evolves as the company scales. If you love building platforms that other engineers love using, this is your role.
You'll Achieve
Own Company‑Wide Infrastructure Platforms
- Design, deploy, and operate Kubernetes platforms across GCP, AWS, and data center environments that serve as the foundation for 100+ engineering teams.
- Build and maintain critical shared services: streaming (Kafka), data storage (Aerospike), workflow orchestration (Airflow), and observability (Prometheus, Grafana) that process billions of events with 99.9%+ reliability.
- Create tooling and automation that transforms complex platform operations into simple self‑service workflows—empowering developers while maintaining security and stability.
- Drive CI/CD evolution by building operators, controllers, and management tools that reduce toil and accelerate deployment velocity.
- Partner with product teams from day one to ensure new features integrate cleanly and reliably into DV's infrastructure, preventing technical debt before it happens.
- Define and promote best practices for automation, observability, security, and maintainability that scale across the organization.
- Plan and deliver high‑impact infrastructure initiatives in collaboration with multidisciplinary DevOps, SRE, and engineering teams across the US, Israel, and Europe.
- Use metrics, logs, and traces to proactively identify and resolve performance bottlenecks, turning insights into lasting improvements.
- 5+ years in DevOps, Platform Engineering, or SRE roles operating production infrastructure at scale.
- 3+ years hands‑on experience with Kubernetes in production environments (bonus if you've owned/managed a K8s platform).
- Strong cloud platform expertise in GCP or AWS (multi‑cloud experience valued).
- Software engineering mindset—you write code (Python, Go, Bash) to solve infrastructure problems, not just configure tools.
- Observability‑driven troubleshooting—you're comfortable diving into metrics, logs, and traces to diagnose distributed system issues.
- Platform thinking—you design for reliability, scalability, and developer experience, not just "getting it working".
- Hands‑on experience with Kafka, Aerospike, ArgoCD, or Airflow in production.
- Background with service mesh technologies (Envoy Gateway, Istio).
- Experience with GitOps workflows and infrastructure‑as‑code (Terraform, Crossplane).
- Contributions to open‑source platform tooling or CNCF projects.
- Site Reliability Engineering (SRE) practices and culture.
- Empathy for developers—you care deeply about improving the experience of engineers using your platforms.
- Systems…