Senior Software Engineer - Storage Infrastructure Job, Richardson area, Texas, USA, IT/Tech

Yahoo serves as a trusted guide for hundreds of millions of people globally, helping them achieve their goals online through our portfolio of iconic products. For advertisers, Yahoo Advertising offers omnichannel solutions and powerful data to engage with our brands and deliver results.

About the Team

Our platform is the foundational identity and data layer for 900M+ monthly active users, serving 2.5B+ profiles at massive scale. We are building a predictive, identity-centric insights engine, ensuring our audience is understood with precision to deliver hyper-personalized experiences and advertising solutions across all our digital properties.

Our mission centers on first-party data strategy: capturing, enriching, and activating audience signals to build a 360-degree view of every user. We operate under a Privacy-by-Design philosophy, adhering to global regulations (GDPR, CCPA) and industry security standards, while leveraging a cloud-native stack across GCP (BigQuery, Spanner, Dataflow, Composer, GKE) and AWS, with modern MLOps practices to deliver measurable business impact.

About the Role

As a Senior Software Engineer, you will design and optimize the foundational storage layer powering our 2.5B+ profile dataset. Your work on Cloud Spanner schema design, Valkey (Redis-compatible) caching strategies, and multi-region replication ensures sub-10ms data access for APIs serving millions of requests per second, directly enabling hundreds of millions in annual advertising revenue.

You will build and maintain petabyte-scale storage infrastructure with 99.99% availability, implementing efficient read/write patterns, multi-region replication, and disaster recovery mechanisms. Your expertise in distributed databases and caching systems is critical to balancing performance, cost, and reliability at massive scale.

This role demands deep knowledge of Cloud Spanner internals, distributed caching architectures, and production database operations. You will collaborate closely with API, Ingestion, and SRE teams to ensure optimal data access patterns while maintaining data durability and system reliability.

Key Responsibilities

Design and optimize Cloud Spanner schemas for efficient profile storage, query patterns, and write throughput at 2.5B+ profile scale
Implement Valkey (Redis-compatible) caching strategies achieving sub-10ms read latency for hot data access patterns
Build multi-region Spanner replication and automated failover mechanisms ensuring 99.99% availability and disaster recovery
Optimize Spanner read/write throughput, reduce hot-spotting, and improve query performance through index design and query optimization
Implement comprehensive monitoring and alerting systems tracking storage health, latency percentiles (p50, p95, p99), capacity utilization, and cost
Collaborate with API team on efficient data access patterns, query optimization, and caching strategies for activation endpoints
Partner with Ingestion team on high-throughput write patterns, batch loading strategies, and schema evolution without downtime
Design backup, point-in-time recovery, and disaster recovery procedures for critical user profile data
Troubleshoot production storage issues including performance degradation, hot-spotting, lock contention, and capacity constraints
Work with SRE teams on capacity planning, autoscaling strategies, cost optimization, and infrastructure efficiency
Implement cache invalidation strategies, cache warming, and distributed caching patterns for consistent data access
Create comprehensive documentation for storage architecture, operational runbooks, disaster recovery procedures, and on-call playbooks

Required Qualifications

Education

Bachelor's degree in Computer Science, Engineering, or related technical field

Experience

5+ years software engineering experience building production systems
3+ years hands-on experience with distributed databases or large-scale storage systems
2+ years with GCP infrastructure (Spanner, Memorystore, Cloud Monitoring) or AWS equivalents (DynamoDB, ElastiCache)

Technical Skills

Strong proficiency in Java, Go, or Python for infrastructure and database tooling development
Hands-on experience with…