Senior Software Engineer, Site Reliability Tooling
Listed on 2026-01-07
Location: United States (Remote, with quarterly onsite sessions in San Mateo, Columbus, or Austin)
Company Stage of Funding: Public / Late-Stage
Office Type: Digital-First (Remote with Quarterly Onsites)
Salary: $163,600 – $226,400 + Bonus + Equity
Company Description:
Our client is a leading AI-driven lending marketplace transforming how banks and credit unions evaluate and approve borrowers. Their platform delivers higher approval rates, lower loss rates, and a seamless digital-first experience—enabling more than 80% of applicants to be automatically approved without document uploads.
They operate as a digital-first company with hubs across the U.S., and employees join because they’re motivated by the mission: increasing access to fair, effortless credit by leveraging modern AI and real‑time data.
What You Will Do
As a Senior Software Engineer focused on Site Reliability Tooling, you will play a key role in the reliability, resilience, and observability of large-scale production systems. You’ll design and build tools that empower engineering teams to maintain uptime, deploy safely, and understand system performance across complex microservice architectures.
- Champion SRE principles across engineering and promote a strong culture of service ownership and reliability.
- Build internal tooling from scratch to improve observability, monitoring, alerting, and operational workflows.
- Implement standards to monitor microservices, web apps, mobile apps, machine learning systems, databases, and Kubernetes clusters.
- Improve incident response processes, including on‑call workflows, retrospectives, and reliability reporting.
- Automate toil through infrastructure tooling, scripts, and scalable platform services.
- Help define the long‑term strategy for reliability, disaster preparedness, and operational risk mitigation.
- Collaborate across multiple engineering groups to deliver enterprise‑wide reliability initiatives.
Qualifications
- 6+ years of combined experience in Software Engineering, Site Reliability Engineering, and/or DevOps.
- Strong proficiency in Python, Go, and/or JavaScript/TypeScript.
- Hands‑on experience with Infrastructure-as-Code (Terraform, CDK, CloudFormation).
- Proven background building internal tooling and applying strong software engineering fundamentals (architecture, testing, TDD).
- Strong grounding in data structures and algorithms.
- Experience with on‑call, incident response, and incident management workflows.
- Experience with modern observability tools such as Datadog, Prometheus, Grafana, and CloudWatch.
- Experience supporting high‑scale SaaS systems in microservice cloud environments.
- Ability to work cross‑functionally to drive large engineering initiatives.
- Data‑driven mindset focused on metrics, reliability, and continuous improvement.
Preferred Qualifications
- Experience with service mesh technologies.
- Full‑stack engineering capabilities.
- Background building tooling for observability or monitoring platforms.
- Experience leveraging LLMs / GenAI to improve SRE workflows (chatops, auto‑remediation, alert summarization, etc.).
Compensation and Benefits
- Base Salary: $163,600 – $226,400
- Bonus: Target bonus included
- Equity: Included
- Comprehensive medical, dental, and vision coverage with HSA contributions
- 401(k) with 100% match up to $4,500 (immediate vesting)
- Employee Stock Purchase Plan
- Life and disability insurance
- Flexible vacation, holidays, sick leave, and safety leave
- Parental, family care, and military leave
- Annual wellness, technology, and ergonomic reimbursements
- Team events, ERGs, volunteer groups
- When onsite: catered lunches, snacks, and drinks
- Quarterly team onsite sessions (travel covered)
Salary Range: $142,000-$196,000 base.