Site Reliability Engineer
Listed on 2026-06-06
IT/Tech
Cloud Computing, SRE/Site Reliability
Dave vs. Goliath. We’re Dave.
Dave is a financial app on a mission to build products that level the financial playing field. It is redefining the financial landscape by leveraging technology to create affordable, transparent, and user‑centric access to liquidity for millions of Americans. As a leading innovator in the U.S. financial services sector, Dave’s digital financial platform offers products designed to meet the credit needs of those underserved by traditional financial institutions.
Dave’s offerings include its flagship Extra Cash product, providing members up to $500 in short‑term advances within minutes. The company is on track to launch several new product offerings in 2026, including a Buy Now Pay Later (BNPL) option.
Dave is focused on serving Americans who are financially vulnerable or living paycheck to paycheck. Dave is leading the charge in creating a new era of credit products that prioritizes speed, affordability, and accessibility, making it the go‑to financial partner for those who need it most.
The Opportunity
This is a senior, deeply hands‑on role on a small, high‑leverage SRE team (3–4 engineers). You’ll serve as a technical anchor across cloud infrastructure and networking, shaping how reliability, automation, and performance are embedded into every layer of our platform.
You won’t just respond to incidents. You’ll design the systems that prevent them. You’ll partner closely with the Director of DevX & Infrastructure Engineering and cross‑functional teams to evolve our GCP platform in ways that support product velocity while protecting long‑term durability.
What You’ll Build and Own
Lead architecture and automation across our GCP environment, ensuring reliability, scalability, security, and thoughtful cost management.
Define and improve SLIs, SLOs, and error budgets using Cloud Monitoring and Datadog — connecting reliability goals to real business outcomes.
Shape our multi‑region, disaster recovery, and capacity planning strategies so the platform holds up as we grow.
Design and optimize cloud networking, including VPC architecture, ingress/egress, Cloud Armor, VPN, and DNS to support internal systems, partner integrations, and member‑facing services.
Drive infrastructure‑as‑code and GitOps practices using Terraform, Kubernetes, Helm, and ArgoCD to make deployments predictable and repeatable.
Mentor SREs and infrastructure engineers through design reviews, incident retros, and hands‑on collaboration — strengthening technical depth across the team.
You’ll also explore practical LLM‑driven automation where it meaningfully reduces operational toil and shortens incident resolution time.
The Impact
Reliable systems mean members can access Extra Cash, banking, and credit‑building tools when they need them most. Your work directly supports trust, growth, and long‑term platform resilience.
What we’re looking for
Experience & Technical Foundation
8+ years in software, infrastructure, or site reliability engineering.
5+ years of hands‑on experience operating production systems in GCP (compute, networking, storage, IAM, observability).
Deep experience with Kubernetes (GKE), Helm, containerization, Terraform (IaC), and ArgoCD.
Strong programming skills in Python, Go, or TypeScript/JavaScript for automation and internal tooling.
Experience defining and operating against SLIs, SLOs, and error budgets.
Strong knowledge of relational and distributed databases (e.g., MySQL, Cloud SQL, Cloud Spanner, Redis), including performance tuning and HA strategies.
Experience leading incident response, root cause analysis, and systemic remediation.
Experience in fintech or regulated environments.
Familiarity with CI tooling (GHA, Jenkins, Tekton, CircleCI).
Experience in high‑growth startups.
You take responsibility for outcomes, not just deliverables. You think in systems — how networking decisions affect latency, how reliability targets affect member trust, how cost decisions affect long‑term sustainability. You balance urgency with durability and make thoughtful trade‑offs that hold up over time.
You’re comfortable operating in ambiguity. Not everything is fully defined — and…