Principal Network Observability Data Architect
Listed on 2026-02-19
Manufacturing / Production
Systems Engineer
Who are we?
Equinix is the world’s digital infrastructure company®, shortening the path to connectivity to enable the innovations that enrich our work, life and planet.
A place where tech thinkers and future builders turn bold ideas into breakthrough experiences, we welcome your unique perspective. Help us challenge assumptions, uncover bias, and remove barriers—because progress starts with fresh ideas. You’ll find belonging, purpose, and a team that welcomes you—because when you feel valued, you’re empowered to do your best work.
Job Summary
As the Principal Engineer for NPE Observability, you are the lead architect for the distributed systems that ingest, store, and analyze our global network state. You will bridge the gap between network protocols and big data patterns, designing high-performance ingestion engines capable of handling trillions of telemetry points. Your role is to architect massively parallel processing pipelines and stateful stream processing frameworks that enable real-time anomaly detection across our global infrastructure.
You will build the high-throughput, low-latency data fabric that makes our network self-aware.
Strategic Leadership & AI Evolution
Visionary Roadmap:
Define the multi-year architecture for a unified on-prem and telemetry ecosystem, evolving our global infrastructure into a self-healing "intelligent network"
Network Data Strategy:
Direct the lifecycle of network data, from gNMI/SNMP ingestion to structured storage, ensuring telemetry is normalized, consistent, and optimized for large-scale AI modeling
AI Systems Architecture:
Integrate industry-frontier practices in LLMOps, tool-use frameworks (MCP), and agentic workflows to accelerate incident root-cause analysis of telemetry data and automated remediation/alerting
Advanced Network Telemetry & Big Data
Systems Mastery:
Architect the interplay between network and application layers, optimizing the computational processing across different network protocols
Distributed Data Pipelines:
Design high-throughput, resilient pipelines and propose architectures to ingest trillions of events for predictive AIOps
Technical Excellence & Mentorship
Architectural Integrity:
Enforce SOLID and Clean Architecture principles across network telemetry pipelines, observability data stores, and AI agent orchestration layers to ensure reliable, low-latency insight into network health
Cultural Leadership:
Act as a force multiplier by bridging NetOps, DevOps, and Security to standardize the “MELT” (Metrics, Events, Logs, Traces) strategy across all global interconnection points
Mentorship:
Level up Staff and Senior engineers through deep-dive design reviews and strategic coaching on network-centric observability best practices
Experience:
10+ years architecting distributed systems, high-scale observability platforms, or mission-critical network software
Education:
Bachelor’s degree in Computer Science, Computer Engineering, or a related field
Core Languages:
Expert-level proficiency in Java and Go for building high-performance systems
Networking & Infrastructure:
Deep fluency in gNMI, SNMP and Flow protocols; extensive experience architecting on Kubernetes, Jenkins and ArgoCD. Proficiency and experience working with service provider networking technologies and protocols, including BGP, IS-IS, MPLS, QoS, EVPN, VXLAN
Observability Stack / Big Data Stack:
Expert experience architecting distributed systems and high-scale observability fabrics, with a specialized mastery of OLAP and Time-Series Database architectures such as ClickHouse, Prometheus/Thanos, or InfluxDB. You have a proven track record of designing schemas and tuning storage engines for high-cardinality network telemetry (trillions of events), utilizing Kappa/Lambda patterns and stateful stream processing via Apache Flink and Kafka
AI Expertise:
Proven experience developing AI Agents using LLMs, including Function Calling, RAG, and agentic orchestration
Architect the Intelligence of the Global Internet. At Equinix, you aren’t just building a monitoring tool; you are designing the "central nervous…