Sr. Machine Learning Engineer, Foundation Models - AI, Search & Platforms

Job in Seattle, King County, Washington, 98194, USA
Listing for: Apple
Full Time position
Listed on 2026-05-22
Job specializations:
  • Software Development
    AI Engineer, Machine Learning/ ML Engineer, Software Engineer, Data Scientist
Job Description & How to Apply Below
Position: Staff/Sr. Machine Learning Engineer, Foundation Models - AI, Search & Knowledge Platforms
**Role Number:**
**Summary**
We are the Foundation Model Inference team within the AI, Search & Knowledge Platform Technologies organization. Our team is responsible for building the inference stack that powers Apple Intelligence.
It builds frameworks, services, and tools that power the largest Apple foundation models on servers. Our infrastructure powers a wide gamut of services at Apple, including Apple Search, Apple Music, Apple TV, App Store, iMessage, Photos & Camera, Spotlight, Safari, Siri, and upcoming Apple products, serving millions of queries every day at incredibly low latencies while drawing every ounce of compute from our hardware. As part of this group, you will get a chance to bring intelligence to billions of users across the world. You will have an opportunity to make a difference in people's lives by empowering them with AI.

You will have a chance to optimize multi-billion-parameter language, vision, and speech models using state-of-the-art technologies and run them at Apple scale.

**Description**
+ Work alongside the Foundation Model Research team to optimize inference for cutting-edge model architectures.
+ Work closely with product teams to build production-grade solutions to launch models serving millions of customers in real time.
+ Build tools to understand inference bottlenecks across different hardware and use cases.
+ Mentor and guide engineers in the organization.

**Minimum Qualifications**
+ 5+ years of experience leading and driving complex, ambiguous projects.

+ Experience with LLM inference stacks.

+ Familiarity with GPU programming concepts using CUDA.

+ Familiarity with a popular ML framework such as PyTorch or TensorFlow.

+ Experience with high-throughput services, particularly at supercomputing scale.

+ Proficiency in running applications in the cloud (AWS, Azure, or equivalent) using Kubernetes, Docker, etc.

+ BS in Computer Science, Artificial Intelligence, Machine Learning, Information Retrieval, Data Science or related field

**Preferred Qualifications**
+ Proficiency in building and maintaining systems written in modern languages (e.g., Golang, Python).

+ Familiarity with fundamental deep learning architectures such as Transformers and encoder/decoder models.

+ Familiarity with NVIDIA TensorRT-LLM, vLLM, DeepSpeed, NVIDIA Triton Inference Server, etc.

+ Experience writing custom GPU kernels using CUDA or OpenAI Triton.

+ MS in Computer Science, Artificial Intelligence, Machine Learning, Information Retrieval, Data Science or related field.