Senior Front-End Network Engineer, AI Infrastructure Operations

Job in New York, New York County, New York, 10261, USA
Listing for: Nscale
Full Time position
Listed on 2026-06-22
Job specializations:
  • IT/Tech
    Network Engineer, Systems Engineer, SRE/Site Reliability, IT Infrastructure
Salary/Wage Range or Industry Benchmark: 100,000 - 125,000 USD per year
Job Description & How to Apply Below
Location: New York

About Nscale

Nscale is the GPU cloud engineered for AI. We provide cost‑effective, high‑performance infrastructure for AI start‑ups and large enterprise customers. Nscale enables AI‑focused companies to achieve superior results by reducing the complexity of AI development. Our GPU cloud bolsters technical capabilities and directly supports strategic business outcomes, including cost management, rapid innovation, and environmental responsibility.

We thrive on a culture of relentless innovation, ownership, and accountability, where every team member takes pride in their work and drives it with excellence and urgency. As an Nscaler, you’ll build trust through openness and transparency, where everyone is inspired to do their best work. If you join our team, you’ll be contributing to building the technology that powers the future.

About The Role

Within Nscale, the Network Operations team is responsible for the performance and reliability of the high‑speed networks that underpin our AI platforms. These front‑end networks are critical to inference workloads, cluster management, data movement, and storage connectivity.

What You’ll Be Doing
  • Owning the operational health, configuration consistency, and performance tuning of large‑scale Ethernet front‑end fabrics (leaf‑spine / Clos) supporting AI inference, management, and storage workloads
  • Leading the diagnosis and resolution of complex network incidents (P0/P1), spanning optics, routing, switching hardware, long‑haul circuits, and storage connectivity layers
  • Driving blameless postmortems and implementing preventative fixes to improve long‑term fabric stability and availability
  • Partnering with SREs to define requirements for automation and tooling, and contributing to network provisioning, validation, and monitoring systems
  • Collaborating with Network Architecture and Engineering teams to validate designs and enforce standards for routing, congestion management, firmware baselines, and change safety
  • Monitoring fabric utilisation and performance, identifying bottlenecks, and tuning for predictable latency and throughput on front‑end networks
  • Acting as a subject matter expert for cross‑functional teams on high‑speed Ethernet networking, long‑haul/DCI circuits, and storage network integration
  • Participating in an on‑call rotation supporting mission‑critical, customer‑facing infrastructure
About You
  • 5+ years of experience in network engineering, with at least 3 years operating large‑scale Ethernet data centre or cloud networks
  • Deep, hands‑on operational experience with high‑speed Ethernet fabrics in hyperscale or production environments
  • Strong expertise with Arista (EOS) and/or Nokia (7220 IXR / 7250 IXR / 7750 SR series) platforms
  • Solid understanding of modern data centre networking, including BGP, OSPF, ECMP, EVPN‑VXLAN, and leaf‑spine architectures
  • Proven experience with long‑haul circuits and DCI (dark fiber, carrier Ethernet, coherent optics)
  • Experience with storage networking over Ethernet and shared storage connectivity
  • Proven ability to troubleshoot complex network issues using Linux‑based tooling and fabric diagnostics
  • Proficiency in Python, Go, or shell scripting for automation, data analysis, or configuration management
  • Experience working in a 24/7 operational environment with a strong focus on reliability and toil reduction
Nice to Have
  • Extensive hands‑on experience with Arista or Nokia platforms at scale
  • Deep familiarity with front‑end network patterns for large AI clusters (inference traffic, management networks, and storage integration)
  • Experience operating large‑scale DCI / long‑haul optical or carrier networks
  • Strong background in network observability and telemetry systems (streaming telemetry, sFlow, Prometheus, Grafana, etc.)
  • Prior experience in automation‑first network operations or building internal tooling
What We Can Offer You
  • Highly competitive package (base + equity) with reviews every 12 months
  • Join the fastest‑growing tech startup: your chance to push boundaries, collaborate with brilliant minds, and make your mark on cutting‑edge AI.
  • Expect a dynamic progression plan tailored to your ambitions. Grow by trying new things, leading, challenging the status quo, and owning your impact, always with our full support.
Equal Opportunities Statement

We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio‑economic backgrounds. If there’s anything we can do to accommodate your specific situation, please let us know.

The responsibilities outlined in this job description are not exhaustive and are intended to provide a general overview of the position. The employee may be required to perform additional duties, tasks, and responsibilities as assigned by management, consistent with the skills and qualifications required for the role.

Position Requirements
10+ Years work experience