Sr GenAI Infra Specialist SA, AWS WWSO Startup
Job in
Herndon, Fairfax County, Virginia, 22095, USA
Listed on 2026-06-01
Listing for:
Amazon
Full Time position
Job specializations:
- IT/Tech
Systems Engineer, AI Engineer, Cloud Computing, Data Engineer
Job Description & How to Apply Below
Do you want to help define the future of technology on AWS Generative AI as part of the Specialist Solutions Architect team in the Go-To-Market (GTM) Startup team? Are you passionate about AI infrastructure and helping customers understand the complexities of training and serving large-scale models?
You will be part of the core Specialist Organization's Startup GenAI Go-to-Market (GTM) team, focused on AI infrastructure for model training and inference optimization. You will be responsible for defining, building, and deploying targeted strategies to accelerate adoption of AWS compute, networking, and ML platform services with lighthouse frontier AI model builders across startup companies in different industry verticals.
This role sits at the intersection of AI infrastructure architecture and model optimization - you will help customers understand hardware requirements and complexity (GPU, Trainium, networking), while also providing deep expertise in optimization of models and techniques for both inference serving and distributed training at scale.
AWS Specialist Solutions Architects (SSAs) are technologists with deep domain-specific expertise, able to address advanced concepts and feature designs. As part of the AWS sales organization, SSAs work with customers who have complex challenges that require expert-level knowledge to solve. SSAs craft scalable, flexible, and resilient technical architectures that address those challenges.
Key job responsibilities
- Work directly with the most important and exciting Startup customers in the GenAI model training and inference space, helping them adopt and scale large-scale workloads (e.g., frontier models, multi-modal systems, optimization) on AWS
- Advise customers on AI infrastructure requirements and trade-offs including GPU/Trainium selection, cluster topology, storage, networking (EFA), and cost optimization for training and inference
- Provide deep technical guidance on inference optimization: model serving architectures (self-managed on EKS, SageMaker endpoints, SageMaker HyperPod serving), batching strategies, quantization, model parallelism, and latency/throughput trade-offs
- Provide deep technical guidance on training optimization: distributed training strategies, framework selection (PyTorch, JAX, NeMo), SageMaker HyperPod, Slurm/PCS integration, checkpointing, and data pipeline design
- Guide customers on GPU and accelerator profiling: identifying bottlenecks (compute, memory, I/O), optimizing utilization, and tuning system-level performance
- Help customers understand and apply model optimization techniques: fine-tuning approaches (LoRA, QLoRA, full fine-tuning), RLHF/DPO, knowledge distillation, and efficient serving techniques (vLLM, TensorRT-LLM, Triton)
- Help Go-To-Market Specialists define and drive strategy on assets that impact growth through market sizing, building an opportunity pipeline, creating technical content to train field teams, and establishing thought leadership
- Develop demos, proof-of-concepts, reference architectures, and benchmarks that demonstrate the value proposition of AWS infrastructure for GenAI workloads
- Collaborate with product teams (EC2, Trainium/Inferentia, SageMaker, EKS, PCS) to shape product vision, prioritize features, and represent the voice of the customer
- Work with account teams, research scientists, ISVs, framework communities, and model providers to drive implementations and accelerate innovation
A day in the life
As the ideal candidate, you possess a deep infrastructure and systems background combined with hands-on ML/AI expertise that enables you to lead engagements with frontier AI labs, startups, and large enterprises. You understand:
- The hardware layer: GPU architectures (NVIDIA A100/H100/B200, AWS Trainium/Inferentia), NVLink, EFA networking, storage hierarchies (FSx for Lustre, S3), and how they interact at scale
- The orchestration layer: how to run large-scale training on at least one of EKS/Kubernetes, SageMaker HyperPod, or Slurm/PCS, including cluster management, job scheduling, fault tolerance, and elastic scaling
- The framework/model layer:
Di…