LLM Serving Engineer (Cloud AI Engineering), Senior/Staff Engineer
Listed on 2026-06-23
Engineering
AI Engineer (Applied/Software), Software Engineer
Company
Qualcomm Technologies, Inc.
Job Area: Engineering Group, Engineering Group > Machine Learning Engineering
LLM Serving Engineer (Cloud AI Engineering)
Qualcomm is utilizing its traditional strengths in digital wireless technologies to play a central role in the evolution of Cloud AI. We are investing in several supporting technologies including Deep Learning. The Qualcomm Cloud AI team is developing hardware and software solutions for Inference Acceleration.
We are hiring LLM Serving Engineers at multiple levels to join our dynamic, collaborative team. This role spans the full product lifecycle—from cutting-edge research and development to commercial deployment—and demands strategic thinking, strong execution, and excellent communication skills.
Role Activities
- Build a scalable LLM inference platform using inference techniques (e.g., disaggregated serving and KV-cache management, advanced parallelism, speculative algorithms, model optimization, specialized kernels).
- Contribute to the development of LLM serving packages (e.g., vLLM, SGLang, TGI, Triton Inference Server, Dynamo, llm-d).
- Work closely with customers to drive solutions by collaborating with internal compiler, firmware and platform teams.
- Work at the forefront of GenAI by understanding advanced algorithms (e.g. attention mechanisms, MoEs) and numerics to identify new optimization opportunities.
- Drive efficient serving through autoscaling, load balancing and routing.
- Engage with open-source serving communities to evolve the framework.
- Hands-on experience with one or more of the following LLM serving/orchestration packages: Triton Inference Server, vLLM, SGLang, Ollama, llm-d, KServe, LMCache, Mooncake.
- Deep understanding of foundational LLMs, VLMs, SLMs, transformer-based architectures.
- Strong experience in developing language models using PyTorch.
- Strong computer science fundamentals - algorithms, data structures, parallel and distributed programming.
- Understanding of computer architecture, ML accelerators, in-memory processing and distributed systems.
- Strong Python development skills for large-scale projects and a passion for software engineering.
- Experience in analyzing, profiling, and optimizing deep learning workloads.
- Proactive learning about the latest inference optimization techniques.
- Excellent communication and problem-solving skills, with the ability to thrive in a fast-paced and collaborative environment.
- MS or BS in Computer Science, Machine Learning, Computer Engineering or Electrical Engineering.
- Bachelor’s degree in Computer Science, Engineering, Information Systems, or related field and 4+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
- OR Master’s degree in Computer Science, Engineering, Information Systems, or related field and 3+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
- OR PhD in Computer Science, Engineering, Information Systems, or related field and 2+ years of Hardware Engineering, Software Engineering, Systems Engineering, or related work experience.
Qualcomm is an equal opportunity employer. If you are an individual with a disability and need an accommodation during the application/hiring process, Qualcomm is committed to providing an accessible process. You may email disabili or call Qualcomm’s toll-free number found here. Upon request, Qualcomm will provide reasonable accommodations to support individuals with disabilities to participate in the hiring process. Qualcomm is also committed to making our workplace accessible for individuals with disabilities.
EEO Employer:
Qualcomm is an equal opportunity employer; all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or any other protected classification.
$ - $
The above pay scale reflects the broad, minimum-to-maximum pay scale for this job code for the location for which it has been posted. Salary is only one component of total compensation. We also offer a competitive annual discretionary bonus program and potential RSU grants. Our benefits package supports employees at work and at home. Your recruiter can discuss details about Qualcomm benefits.
If you would like more information about this role, please contact Qualcomm Careers.