Inference Optimization Engineer; local/edge runtime
Job in: Phoenix, Maricopa County, Arizona, 85003, USA
Listed on: 2026-06-24
Listing for: Intel
Position: Full Time
Job specializations: Software Development (AI Engineer (Applied/Software), C++ Developer, Python, Software Engineer)
What You'll Do
- Profile and optimize local inference (llama.cpp-vulkan and vLLM) for latency, throughput, and memory on edge hardware
- Tune KV cache, continuous batching, and scheduling for interactive agent workloads
- Drive quantization strategy (GGUF / AWQ / GPTQ) and validate quality impact with the Post-Training team
- Cut CPU overhead and improve engine startup, model load, and lifecycle (start / stop / health)
- Benchmark across hardware tiers and publish honest performance comparisons
- Upstream fixes and patches to open‑source engines where it helps us
What You'll Bring
- The internals of modern inference engines and where the milliseconds actually go
- Hardware‑aware optimization across iGPU / CPU paths (Vulkan, SYCL, oneAPI, CUDA where relevant)
- The quality‑vs‑speed‑vs‑memory trade space for small models
- Interest in local / edge AI and squeezing hardware
Minimum Qualifications
- BS/MS in CS, EE, Math or related STEM field
- 5+ years of software development experience
- Strong in C++ and/or Python; comfortable reading systems‑level code
- Understands how LLM inference works (attention, KV cache, decoding)
- Has profiled and optimized real performance problems (CPU or GPU) and can prove the speedup
- Linux, build systems, and low‑level debugging expertise
Preferred Qualifications
- Hands‑on with llama.cpp, vLLM, ggml, or similar engines
- Experience with GPU / accelerator programming (Vulkan, CUDA, SYCL, Metal) or SIMD / CPU kernels
- Familiarity with quantization formats and their quality trade‑offs
- Open‑source contributions to inference engines
US: $ - USD
Work Model
This role will be eligible for a hybrid work model, allowing employees to split their time between working on‑site at their assigned Intel site and off‑site.
EEO Statement
All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.