
Systems Research Engineer - LLM Optimisation (vLLM / TensorRT-LLM)

Job in City of Edinburgh, Edinburgh, City of Edinburgh Area, EH1, Scotland, UK
Listing for: Project People
Full Time position
Listed on 2026-06-06
Job specializations:
  • IT/Tech
    AI Engineer, Machine Learning (ML) Engineer, Data Scientist
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 GBP yearly
Job Description & How to Apply Below
Position: Systems Research Engineer - LLM Optimisation (vLLM / TensorRT-LLM)
Location: City of Edinburgh

Systems Research Engineer - LLM Optimisation (vLLM / TensorRT-LLM)

Permanent

Edinburgh City Centre (On-site 5 days), walking distance from local transport links

Salary: Competitive and negotiable, generous benefits package

In an era where Large Language Models (LLMs) are rebuilding the foundational software stack, our client is at the forefront of reshaping how large-scale models are trained, served, and deployed. Operating at the intersection of advanced systems research and industrial-scale engineering, their Edinburgh-based team is driving new AI Infrastructure & Agentic Serving architectures.

This role is a unique opportunity to help define next-generation large-scale data centres and AI infrastructure systems, turning innovative system designs into deployable, real-world technologies.

We are seeking Systems Research Engineers with a deep passion for computer systems, distributed AI infrastructure, and performance optimization. These roles are ideal for recent PhD graduates or exceptional BSc/MSc engineers looking to build research-driven experience in operating systems, distributed systems, AI model serving, and machine learning infrastructure. You will work closely with architects to prototype and optimize the next generation of global AI clusters.

What you will be doing
  • Distributed Systems Research & Development: Architect, implement, and evaluate distributed system components for emerging AI and data-centric workloads. Drive modular design and scalability across GPU and NPU clusters, building highly efficient serving and scheduling systems.
  • Performance Optimization & Profiling: Conduct in-depth profiling and performance tuning of large-scale inference and data pipelines, focusing on KV cache management, heterogeneous memory scheduling, and high-throughput inference serving using frameworks like vLLM, Ray Serve, and PyTorch Distributed.
  • Scalable Model Serving Infrastructure: Develop and evaluate frameworks that enable efficient multi-tenant, low-latency, and fault-tolerant AI serving across distributed environments. Research and prototype new techniques for cache sharing, data locality, and resource orchestration and scheduling within AI clusters.
  • Research & Publications: Translate innovative research ideas into publishable contributions at leading venues (e.g., OSDI, NSDI, EuroSys, SoCC, MLSys, NeurIPS, ICML, ICLR) while driving internal adoption of novel methods and architectures.
  • Cross-Team Collaboration: Communicate technical insights, research progress, and evaluation outcomes effectively to multidisciplinary stakeholders and global research teams.
What we are looking for
  • Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or related field
  • Fresh PhD graduates in systems, distributed computing, or large-scale AI infrastructure are also welcome
  • At least 2 years of experience with LLM inference / serving framework optimization (vLLM / Ray Serve / TensorRT-LLM / PyTorch)
  • Hands-on experience with distributed KV cache optimization
  • Familiarity with GPUs and how they execute LLMs
  • Strong knowledge of distributed systems, operating systems, machine learning systems architecture, inference serving, and AI infrastructure
  • Solid grounding in systems research methodology, distributed algorithms, and profiling tools.
  • Proficiency in C/C++, with additional experience in Python for research prototyping.
  • Team-oriented mindset with effective technical communication skills

If this sounds like a role you can take hold of, we would love to hear from you! To apply for this role, please send your CV to Maggie Kwong
