AI Platform & Inference Suite Engineer/Senior level KSA Job: Riyadh area, Riyadh Region, Saudi Arabia, IT/Tech

Position: AI Platform & Inference Suite Engineer (Staff/Senior Staff level, KSA)

Company

Qualcomm Middle East Information Technology Company LLC

Job Area

Engineering Group > Software Engineering

General Summary

Qualcomm is enabling a world where everyone and everything can be intelligently connected. You interact with products and technologies made possible by Qualcomm every day, including intelligent edge devices, next‑generation computing platforms, and advanced AI solutions. Qualcomm’s leadership in AI, high‑performance compute, and connectivity is driving innovation across cloud, edge, and data center environments, delivering scalable, power‑efficient platforms that power the next generation of intelligent infrastructure.

About the Role

Qualcomm is seeking a Machine Learning Applications Engineer – AI Inference & Model Optimization to support the enablement of rack‑scale deep learning workloads on advanced Qualcomm AI inference accelerators. These accelerators utilize Qualcomm's expertise in hardware‑accelerated AI to deliver high‑performance, energy‑efficient generative AI and computer vision inference solutions for modern data centers.

What You’ll Do

Deploy, optimize, and scale deep learning AI models onto accelerator‑based data center platforms.
Handle model conversion workflows and quantization techniques (INT8 / mixed precision).
Perform runtime integration and optimization.
Integrate ML models onto Qualcomm’s Cloud AI ML stack from frameworks such as PyTorch, TensorFlow, and ONNX.
Drive improvements in model throughput, latency, and accuracy, with clear trade‑off analysis.
Build, test, and deploy scalable inference pipelines using serving frameworks such as vLLM, TGI, and Triton.
Optimize workloads for LLM and GenAI models across multi‑SoC and multi‑card architectures.
Collaborate with engineering teams to analyze and refine training and inference for advanced deep learning applications.
Identify bottlenecks across compute, memory, and runtime, and guide optimization strategies.
Contribute to Qualcomm’s Cloud AI GitHub repository and developer documentation.
Develop and integrate end‑to‑end ML application pipelines with customer frameworks and libraries.
Act as a trusted technical advisor for customers deploying AI workloads.
Engage in hardware sizing and architecture discussions, aligning model requirements with infrastructure capabilities.
Provide technical guidance on AI model selection, deployment feasibility, and system architecture expectations.
Lead discussions on model capabilities and limitations based on real customer use cases.
Assess and evaluate AI model requirements and recommend alternative model approaches.
Align model characteristics with accelerator and system capabilities.
Support customers in defining model selection strategies based on deployment realities.
Evaluate performance characteristics of AI models in production scenarios, including throughput, latency, and concurrency.
Guide architecture decisions around scaling strategies, hardware deployment sizing, and capacity planning.
Drive discussions around end‑to‑end AI pipelines, including multi‑model workflows and data preprocessing/post‑processing stages.
Lead or support model trade‑off analysis and validation in deployment environments.
Collaborate with customers to define inference assumptions and model sizing strategies for large‑scale workloads.

Required Qualifications

Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or related field (or equivalent experience).
10–15+ years of experience in deep learning model development or deployment on CPUs/GPUs/ASICs.
Inference systems and optimization experience.
Experience with data center or edge AI platforms.
Strong experience with model quantization and optimization techniques.
Proficiency with AI model frameworks (PyTorch, TensorFlow).
Experience in model deployment pipelines.
Excellent C/C++/Python programming and software design skills.
Hands‑on expertise with Linux‑based systems, low‑level software, drivers, and system bring‑up.
Proven ability to analyze and optimize model performance in production environments.
Solid understanding of AI inference hardware constraints and system‑level performance bottlenecks.
Strong communication skills…
