
ML Operations & Customer Support Engineer, Staff/Senior Staff level KSA

Job in Riyadh, Riyadh Region, Saudi Arabia
Listing for: Qualcomm
Full Time position
Listed on 2026-06-21
Job specializations:
  • IT/Tech
    SRE/Site Reliability, Cloud Computing: Infrastructure & Operations, Systems Engineer
Salary/Wage Range or Industry Benchmark: 150000 - 200000 SAR Yearly
Job Description & How to Apply Below
Position: ML Operations & Customer Support Engineer, Staff/Senior Staff level KSA

Company

Qualcomm Middle East Information Technology Company LLC

Job Area

Engineering Group > Software Engineering

Overview

Qualcomm is seeking a Machine Learning Operations & Customer Support Engineer within the Customer Engineering team to support strategic customers deploying AI inference workloads on advanced Qualcomm AI inference accelerators.

This customer‑facing, production‑critical role focuses on ensuring maximum system uptime, reliability, and performance while resolving support cases within defined SLAs/KPIs. The role requires deep expertise across ML inference pipelines, systems troubleshooting, and data center operations, working closely with customers, internal engineering, and product teams.

What You’ll Do

Customer Support & SLA Ownership
  • Act as the primary technical escalation point for customer issues related to AI inference workloads
  • Own end‑to‑end case management, ensuring resolution within agreed SLAs and KPIs
  • Drive incident response, triage, and root cause analysis (RCA)
  • Provide timely and transparent communication to customers on issue status and resolution
  • Maintain high levels of customer satisfaction and service reliability
Uptime, Reliability & Operations
  • Ensure high availability and uptime of customer AI deployments (rack‑scale systems)
  • Monitor system health, performance metrics, and workload behavior
  • Implement and manage failover, redundancy, and resiliency mechanisms
  • Proactively identify risks and implement preventative actions
AI Inference Workload Support
  • Support deployment, optimization, and troubleshooting of ML inference pipelines
  • Debug issues across model, runtime, system, and hardware layers
  • Analyze model performance (latency, throughput, accuracy trade‑offs) in production
  • Support frameworks such as PyTorch, TensorFlow, and ONNX, along with model conversion flows
  • Assist in model optimization techniques (quantization, batching, compilation, runtime tuning)
System & Infrastructure Engineering
  • Support bare‑metal and virtualized environments for AI workloads
  • Troubleshoot issues across Linux OS, drivers, firmware, and networking stack
  • Support deployment and maintenance using Infrastructure as Code (IaC) and automation tools
  • Work with DCIM tools and monitoring systems for infrastructure visibility
  • Coordinate with hardware vendors for accelerator, server, and networking issues
Monitoring, Observability & Automation
  • Implement and manage monitoring systems (logs, metrics, traces)
  • Build dashboards for uptime, SLA adherence, performance, and utilization
  • Automate repetitive operational tasks using scripts and workflows
  • Establish and enforce runbooks and standard operating procedures (SOPs)
Cross‑Functional Collaboration
  • Work closely with Customer Engineering, Product, Engineering, and Support teams
  • Provide structured feedback to engineering for product improvements and defect resolution
  • Support customer onboarding, deployment readiness, and operational handover
  • Participate in customer reviews, escalations, and technical deep dives
Required Qualifications
  • Bachelor’s degree in Computer Science, Computer Engineering, Electrical Engineering, or related field
  • 10–15+ years of experience in ML operations, systems engineering, or customer support engineering
  • Proven experience in customer‑facing technical roles with SLA‑driven support models
  • Strong experience with AI/ML inference workloads in production environments
  • Deep understanding of end‑to‑end ML inference pipelines
  • Hands‑on experience with Linux systems, system bring‑up, drivers, and debugging tools
  • Strong understanding of AI accelerator architecture and system bottlenecks
  • Experience with model deployment, optimization, and performance tuning
  • Experience with data center operations and rack‑scale deployments
  • Familiarity with bare‑metal, virtualization, and containerization technologies (Docker, Kubernetes)
  • Knowledge of networking and storage concepts (TCP/IP, RDMA, storage systems)
  • Experience with cloud and hybrid environments
  • Experience with monitoring/observability tools (Prometheus, Grafana, ELK, etc.)
  • Strong skills in incident management, RCA, and production operations
  • Experience defining and tracking SLAs, KPIs, and operational metrics
  • Proficiency in Python, Bash, or similar scripting…
Position Requirements
10+ Years work experience