Software Engineer, Observability
San Francisco, San Francisco County, California, 94199, USA
Listed on 2026-02-16
Software Development
Software Engineer, Cloud Engineer - Software, DevOps
About Pinterest:
Millions of people around the world come to our platform to find creative ideas, dream about new possibilities and plan for memories that will last a lifetime. At Pinterest, we’re on a mission to bring everyone the inspiration to create a life they love, and that starts with the people behind the product.
Discover a career where you ignite innovation for millions, transform passion into growth opportunities, celebrate each other’s unique experiences and embrace the flexibility to do your best work. Creating a career you love? It’s Possible.
We're seeking an exceptional Staff Software Engineer to join our Observability team. This role combines deep technical expertise in distributed systems and data engineering with a product-oriented mindset to build world-class observability solutions that empower our engineering organization. As a Staff Engineer on the Observability team, you'll be responsible for designing and building the infrastructure and tools that provide visibility into Pinterest's large-scale distributed systems, helping thousands of engineers understand, debug, and optimize their services.
What you’ll do:
- Define and execute the observability roadmap, treating it as a product. Understand engineering team needs and translate them into technical solutions with measurable impact.
- Architect, build, and scale distributed observability infrastructure (metrics, logs, traces) to handle massive volumes across Pinterest's distributed systems.
- Build high-performance data pipelines and storage for real-time and historical telemetry analysis at Pinterest scale.
- Champion Best Practices: Establish observability standards and patterns across the organization, making it easy for teams to instrument their services and gain actionable insights.
- Technical Leadership: Mentor engineers, lead architectural reviews, and influence technical decisions across teams to improve overall system reliability and performance.
- Cross-functional Collaboration: Partner with SRE, Infrastructure, Product Engineering, and other teams to understand pain points and deliver solutions that improve developer productivity and system reliability.
- Innovation: Stay current with observability trends and technologies, evaluating and adopting cutting-edge tools and techniques to keep Pinterest at the forefront.
- Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
- Product Mindset: Demonstrated ability to work backwards from customer needs: understanding user needs, prioritizing features, measuring success, and iterating based on feedback. Experience building internal platforms or tools with strong adoption.
- Distributed Systems Expertise: 7+ years of experience designing and operating large-scale distributed systems with deep understanding of consistency, availability, scalability, and failure modes.
- Data Engineering Skills: Strong background in building data pipelines, working with time-series databases, columnar storage, stream processing (Kafka, Flink, etc.), and data modeling at scale.
- Observability Domain Knowledge: Hands-on experience with modern observability tools and practices including metrics, logging, tracing, and profiling. Familiarity with OpenTelemetry, Prometheus, Grafana, or similar technologies.
- Programming Proficiency: Expert-level coding skills in languages like Java, Python, Go, or Scala with ability to write production-quality code.
- Systems Thinking: Ability to see the big picture while managing complex technical details, balancing trade-offs between cost, performance, and reliability.
- Experience building observability platforms from the ground up or significantly scaling existing solutions.
- Familiarity with cloud-native architectures and technologies (Kubernetes, service mesh, etc.).
- Track record of driving adoption of internal platforms through excellent documentation, UX, and developer advocacy.
- Experience with machine learning or anomaly detection applied to observability use cases.
- Strong communication skills with ability to influence stakeholders at all levels.
- Contributions to open-source observability projects are a plus.
- W…