AI Inference Internship

Job in Greater London, London, Greater London, W1B, England, UK
Listing for: Pantera Capital
Full Time, Part Time, Apprenticeship/Internship position
Listed on 2026-06-20
Job specializations:
  • Software Development
    AI Engineer (Applied/Software), Data Scientist, Machine Learning/ ML Engineer
Salary/Wage Range or Industry Benchmark: 10,000 - 40,000 GBP yearly
Job Description & How to Apply Below
Location: Greater London

Perplexity is excited to announce the Internship Program for exceptional Master’s or PhD students studying Computer Science or Engineering in the UK, enrolled in the  academic year. This is an intensive program in which you will work directly with our AI Inference team. It offers a unique opportunity to gain valuable experience in a rapidly growing AI startup. Outstanding performers may be offered a full-time position at the end of the program.

Our AI Inference team is responsible for running the models behind the Perplexity products. The team maintains the inference engine and the deployments behind models ranging from single-node embeddings to distributed sparse Mixture-of-Experts models, and it also maintains large GPU clusters. With a keen focus on latency and throughput, the Inference team is responsible for the entire serving stack, from GPU kernels to networking and monitoring infrastructure.

Responsibilities
  • Work with the inference team to improve serving latency and throughput
  • Bring up support for new models and state-of-the-art inference optimizations or quantization schemes
  • Optimize inference across the entire stack, from GPU kernels to serving endpoints
Qualifications
  • Strong engineering track record with proven knowledge of fundamentals and programming languages (multi-threaded programming, networking, compilation, systems programming, etc.)
  • Pursuing a Master’s or PhD in Computer Science with a focus on performance-related subjects (HPC, Compilers, Distributed Systems)
  • Experience with ML frameworks (Torch, JAX)
  • Experience with GPU programming (CUDA, Triton)
  • Experience with High-Performance Computing (OpenMPI)
Schedule
  • Internship program: 13 weeks, full-time or part-time, in person in the London office (hybrid schedule: 3 days in the office, 2 days WFH)
Interview Process
  • Fill out the application on the Perplexity website.
  • If selected, you will go through People Ops and technical interviews.
  • Offer. We’re impressed! We’d love to welcome you to our Internship program!
  • Start. We have a desk waiting for you in our London office!
FAQ

Do you sponsor visas? Can I apply if I need a visa to work in the UK?

Unfortunately, we are unable to sponsor visas.

What if I’m on a student visa?

You need to seek approval from your university, which will determine whether you are eligible to work full time or part time only.

How many internship spots are there?

  • We have spots for 2-3 interns in our 2026 class.

Is housing provided?

  • Unfortunately we cannot provide housing.

Is health insurance provided?

  • Unfortunately we cannot provide health insurance for interns. Full-time employees receive full health insurance and benefits.

How many full-time offers are available at the end of the internship?

  • There is no limit. All outstanding performers will be given a full-time offer!
Position Requirements
Less than 1 Year work experience