AI Infrastructure Benchmarking and Network Validation Engineer

Job in San Jose, Santa Clara County, California, 95199, USA
Listing for: Cisco Systems, Inc.
Full Time position
Listed on 2026-06-30
Job specializations:
  • IT/Tech
    Systems Engineer, Network Engineer
Salary/Wage Range or Industry Benchmark: 199,700 - 254,600 USD yearly
Job Description & How to Apply Below

The application window is expected to close on: 08/28/2026

Job posting may be removed earlier if the position is filled or if a sufficient number of applications are received.

Meet the Team

The Hyperscaler Routing team within Cisco Networking focuses on delivering advanced routing solutions tailored for hyperscale environments. The team leverages Cisco Silicon One technology, which converges routing and switching silicon to provide high capacity, power efficiency, and programmability. This enables hyperscalers to architect distributed AI environments with seamless scalability and security. Their work supports cutting‑edge AI infrastructure and cloud‑scale networking, addressing the demands of AI traffic and next‑generation hyperscale routing systems.

Your Impact

As a key contributor to Cisco’s AI/ML infrastructure initiatives, you will plan, execute, and analyze comprehensive benchmarks on Cisco switches, focusing on throughput, latency, congestion, incast, failover, path diversity, and workload performance to ensure optimal AI/ML network operations.

You will guide AI/ML workload deployments from initial scoping and test planning through execution and benchmark analysis, ensuring that success criteria are met. Your role includes developing AI‑driven automation workflows to streamline network development, operations, and implementation.

You will define rigorous benchmark methodologies, test plans, KPIs, pass/fail criteria, and reporting structures for AI RoCE Ethernet fabrics, benchmarking fabric performance across critical metrics including latency, throughput, path diversity, ECMP and link utilization, congestion behavior, packet drops, retransmissions, queue occupancy, and recovery behavior. You will run and analyze performance tests using industry‑standard tools such as NCCL, RCCL, _bw, _bw, _bw, _lat, netperf, iperf, MPI, OSU benchmarks, and microburst test methods.

You will validate switch ASIC features including buffers, schedulers, QoS/queuing, ECMP behavior, telemetry, hashing, traffic distribution, and congestion visibility.

Owning switch OS configuration and automation, you will utilize SONiC, NX‑OS, Ansible, Python, Bash, Git, and related tooling to implement and validate advanced features such as SRv6, segment routing, uSID, Adj‑SID, and policy‑based pathing as required. You will document PoC architecture, benchmark methodologies, topology diagrams, configurations, results, findings, and recommendations.

This role empowers you to shape the future of AI infrastructure networking by delivering scalable, high‑performance, and resilient network fabrics that meet the stringent demands of AI/ML workloads, driving innovation and customer success at Cisco.

Minimum Qualifications
  • Bachelor's degree + 7 years of related experience, or Master's degree + 4 years of related experience.
  • Experience using Python for automation.
  • Experience with L2/L3 network protocols such as BGP, OSPF, EVPN, VXLAN, IPv6, or similar.
  • Experience with traffic generation tools such as Spirent, Ixia, or similar.
  • Docker or Kubernetes experience.
  • Experience with network testing and validation.
Preferred Qualifications
  • Clear written and verbal communication skills as well as documentation skills.
  • Experience with SONiC, NX‑OS, Linux, or other open source network operating systems.
  • Deep understanding of leaf‑spine fabrics and how to troubleshoot them.
  • Experience with Cisco Nexus Dashboard and related automation tools for provisioning, managing and troubleshooting the fabric.
  • Experience handling complex network segmentation, security policies, and multi‑site fabric designs.
  • Experience with RDMA, RoCEv2, PFC, ECN, congestion control, QoS, buffer behavior, and lossless Ethernet concepts.
Why Cisco?

At Cisco, we’re revolutionizing how data and infrastructure connect and protect organizations in the AI era – and beyond. We’ve been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.

Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions.…
