Network Reliability Engineer - Decentralized High-Performance Computing Leader

Job in Seattle, King County, Washington, 98127, USA
Listing for: Andiamo
Full Time position
Listed on 2025-10-26
Job specializations:
  • IT/Tech
    Systems Engineer, Network Engineer
Salary/Wage Range or Industry Benchmark: 100000 - 125000 USD Yearly
Job Description & How to Apply Below
Position: Network Reliability Engineer - Decentralized High-Performance Computing Leader

About The Role

We’re searching for an expert Network Reliability Engineer to architect, optimize, and operate the high-performance network fabrics that power large-scale AI and HPC workloads. You’ll be at the core of the engineering team responsible for building ultra-low-latency, high-throughput networks that connect thousands of GPUs and servers across global datacenters. This isn’t a traditional networking role — it’s an opportunity to shape the performance backbone of some of the world’s most demanding compute environments.

You’ll blend deep networking expertise with software engineering to deliver systems that are not only reliable and scalable but also faster and more efficient than ever before.

What You’ll Do
  • Engineer next-generation network performance:
    Fine‑tune TCP/IP, RDMA (RoCE), kernel‑bypass technologies (DPDK, XDP, eBPF), and NIC offloads to push latency and throughput to their physical limits for high‑performance computing workloads.
  • Deploy and scale at massive capacity:
    Roll out and optimize large‑scale network fabrics across datacenters using top‑tier hardware (Arista, NVIDIA/Mellanox, Juniper, and more). Configure advanced BGP/EVPN topologies, spine‑leaf architectures, and congestion management for lossless transport.
  • Automate network intelligence:
    Build telemetry pipelines and automated systems for real‑time performance monitoring, packet‑loss detection, and predictive congestion analysis across complex environments.
  • Debug at the deepest levels:
    Lead investigations into packet loss, latency anomalies, and congestion hot spots — diving into kernel traces, switch firmware, and flow control mechanisms to pinpoint and resolve issues.
  • Collaborate with the industry’s best:
    Work directly with hardware and silicon vendors to debug firmware, optimize RDMA and RoCE paths, validate optics, and integrate emerging technologies like 800G+ links and CPO/LPO networking.
  • Design for resilience and reliability:
    Simulate large‑scale network failures, run game‑day exercises, and turn lessons learned into robust automation, playbooks, and SLOs that drive measurable reliability improvements.

Who You Are
  • 7+ years of experience in network engineering, SRE, or performance infrastructure roles — ideally within AI, HPC, or large‑scale cloud environments.
  • Deep understanding of the Linux networking stack, including kernel‑level debugging, TCP/IP, InfiniBand, and RoCE.
  • Proven hands‑on experience managing multi‑layer datacenter networks, network overlays (VXLAN, Geneve), and multi‑vendor environments (Arista, NVIDIA/Mellanox, Juniper, etc.).
  • Strong programming proficiency in Python, Go, or Rust, and experience with Infrastructure‑as‑Code and modern CI/CD practices.
  • Practical knowledge of DPDK, XDP, eBPF, and hardware acceleration frameworks used in low‑latency networking.
  • Demonstrated success in building and scaling high‑performance, low‑latency network architectures for data‑intensive systems or compute clusters.

Why This Role Matters

Modern AI and high‑performance computing workloads push data through networks at unprecedented speed and scale. This role sits at the intersection of innovation and reliability — where every microsecond and packet matters. As a Senior Network Reliability Engineer, you’ll design and operate the connective tissue of advanced compute infrastructure, ensuring the world’s most powerful systems run seamlessly, efficiently, and at peak performance.
