
Senior Solutions Engineer, AI Infrastructure

Job in New York, New York County, New York, 10261, USA
Listing for: VAST Data
Full Time position
Listed on 2026-06-06
Job specializations:
  • IT/Tech
    Systems Engineer, Data Engineer, Cloud Computing, Network Engineer
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 USD yearly
Job Description & How to Apply Below
Location: New York

We're looking for a deeply technical Solutions Architect to help customers design, evaluate, and deploy infrastructure for large-scale AI, HPC, analytics, and data‑intensive workloads.

This is a customer‑facing technical role for someone who has lived inside production infrastructure. You may have been a platform engineer, infrastructure engineer, SRE, MLOps engineer, AI infrastructure engineer, storage engineer, cloud engineer, or HPC systems engineer. What matters most is that you have built, operated, or architected real systems, and can bring that credibility into customer conversations.

Our customers are building infrastructure at serious scale: GPU clusters, high‑performance storage systems, Kubernetes platforms, distributed training environments, inference platforms, data pipelines, lakehouses, and large enterprise systems. You'll help them reason about architectures involving 10,000+ GPUs, 100PB+ of storage, high‑performance networking, distributed file systems, orchestration layers, and demanding production workloads.

You'll own technical discovery, architecture design, PoC planning, competitive positioning, and customer technical strategy. You'll work from the first whiteboard session through evaluation, deployment planning, and production success. You'll also partner closely with product and engineering teams to bring field feedback into the roadmap.

We're looking for someone who can go deep technically, communicate clearly, operate without a rigid playbook, and translate complex infrastructure into customer outcomes.

Responsibilities
  • Lead technical discovery with customers across infrastructure, platform, ML, data, and executive stakeholders.
  • Design architectures for large‑scale AI, HPC, analytics, and enterprise data workloads.
  • Help customers evaluate infrastructure involving GPUs, storage, networking, orchestration, and data movement.
  • Design and execute proofs of concept that validate performance, scale, reliability, and business value.
  • Translate complex technical requirements into clear solution designs, reference architectures, and deployment guidance.
  • Debug customer issues across Linux, storage, networking, Kubernetes, schedulers, GPUs, and application workloads.
  • Build technical assets, demos, runbooks, and field guidance for repeatable customer engagements.
  • Partner with sales on technical strategy, competitive positioning, and deal execution.
  • Partner with product and engineering to communicate customer requirements, gaps, and roadmap opportunities.
  • Help customers move from architecture design to production deployment.
Requirements
  • 8 to 12+ years of technical experience, with significant hands‑on infrastructure experience.
  • Experience building, operating, or architecting production platform infrastructure.
  • Strong understanding of Linux kernel internals, distributed systems (including consensus protocols such as Paxos and Raft), storage implementation details such as NAND flash behavior and write amplification, store‑and‑forward networking, load‑balancing design, and production operations.
  • Experience with one or more of: GPU infrastructure, large‑scale HPC systems, Kubernetes platforms built from scratch, MLOps, storage systems, cloud infrastructure, data platforms, or large‑scale enterprise infrastructure.
  • Ability to communicate credibly with engineers, architects, technical executives, and business stakeholders.
  • Strong discovery, problem‑solving, and systems debugging skills.
  • Comfort operating in ambiguous, fast‑moving environments.
  • Interest in customer‑facing technical work, solution design, and business outcomes.
Preferred Experience
  • Experience with large‑scale GPU clusters, distributed training, inference infrastructure, or AI platforms.
  • Experience with petabyte‑scale storage or high‑performance data systems.
  • Experience with Kubernetes, Slurm, Ray, Spark, or other orchestration/scheduling systems.
  • Domain expertise with one or more of: Lustre, Ceph, Weka, BeeGFS, GPFS, VAST, object storage, or distributed file systems.
  • Experience with InfiniBand, RoCE, RDMA, high‑performance Ethernet, or NVIDIA/Mellanox networking.
  • Direct experience with CUDA, NCCL, DCGM, GPUDirect, checkpointing, dataset staging, or model‑serving infrastructure.
  • Experience across multiple industries or customer environments.
Position Requirements
10+ years of work experience