AI Inference Intern: Optimize Latency & Throughput
Job in
London, Greater London, W1B, England, UK
Listed on 2026-06-24
Listing for:
Perplexity AI
Full Time, Part Time, Apprenticeship/Internship position
Job specializations:
- IT/Tech
AI Engineer (Applied/Software), Machine Learning/ ML Engineer
Job Description & How to Apply Below
Perplexity AI is looking for candidates to join its AI Inference team in London. This role involves maintaining and optimizing the inference engine for its products while ensuring low latency and high throughput across GPU clusters.
Applicants should be pursuing a Master's or PhD in Computer Science and have a strong engineering background, along with experience in ML frameworks such as Torch or JAX.
This is a 13-week, in-person internship program, available on a full-time or part-time basis.