Research Engineer/Scientist - Machine Learning RL & Optimisation (Contractor)

Job in London, Greater London, W1B, England, UK
Listing for: Huawei Technologies Research & Development (UK) Ltd
Contract position
Listed on 2026-06-20
Job specializations:
  • IT/Tech
    AI Engineer (Applied/Software), Machine Learning/ ML Engineer, Data Scientist
Salary/Wage Range or Industry Benchmark: 80,000 - 100,000 GBP yearly
Job Description & How to Apply Below
Position: Research Engineer/Scientist - Machine Learning RL & Optimisation (Contractor)
Location: Greater London

About Huawei Research and Development UK Limited

Founded in 1987, Huawei is a leading global provider of information and communications technology (ICT) infrastructure and smart devices. We have 207,000 employees and operate in over 170 countries and regions, serving more than three billion people around the world.

Our vision and mission is to bring digital to every person, home and organization for a fully connected, intelligent world. To this end, we will drive ubiquitous connectivity and promote equal access to networks; bring cloud and artificial intelligence to all four corners of the earth to provide superior computing power where you need it, when you need it; build digital platforms to help all industries and organizations become more agile, efficient, and dynamic; and redefine user experience with AI, making it more personalized for people in all aspects of their life, whether they’re at home, in the office, or on the go.

This spirit of innovation has led Huawei to work in close partnership with leading academic institutions in the UK to develop and refine the latest technologies. With a shared commitment to innovation and progress, both parties have worked together to achieve common goals and establish a strong partnership. The partnership between the UK and Huawei helps to develop the technologies of the future that will transform the way we all communicate, work and live.

For the past 30 years we have maintained an unwavering focus, rejecting shortcuts and easy opportunities that don't align with our core business. With a practical approach to everything we do, we concentrate our efforts and invest patiently to drive technological breakthroughs.

This strategic focus is a reflection of our core values:
  • Staying customer-centric,

  • Inspiring dedication,

  • Persevering,

  • Growing by reflection.

Huawei Research and Development UK Limited Overview

Huawei’s vision is a fully connected, intelligent world. To achieve this, we work to inspire passion for basic research around the world. Our combined passion drives development across the global innovation value chain. Huawei has the largest Research and Development organization in the world with 96,000+ employees in research centers around the globe. In the UK, we already have design centers in Cambridge, London, Edinburgh and Ipswich.

We continue to explore and define new research directions and new services. We have expanded our collaborations with academic researchers; researched new network architectures, integration of communications and key enabling technologies; and developed the fundamental theories of these technologies. We invite you to join us on this exciting journey and drive your career forward.

Job Summary

Research and develop large-scale machine learning systems, alignment workflows, and optimization infrastructure to advance LLM reasoning and post-training capabilities. Design and execute scaled reinforcement learning pipelines (e.g., PPO, GRPO) using distributed training frameworks (verl, trl, DeepSpeed, FSDP) integrated with high-performance inference engines (vLLM). Optimize low-level training throughput, kernel performance, and memory utilization across heterogeneous hardware clusters using expressive hardware DSLs (e.g., TileLang, Triton).

Advance the LLM orchestration loop and leverage Bayesian optimization to automate the search, generation, and continuous improvement of high-performance NPU kernels.

Key Responsibilities
  • Design and execute scaled RL fine-tuning workflows (e.g., PPO, GRPO) to enhance LLM reasoning, instruction-following, and alignment.

  • Architect and manage large-scale distributed training experiments across multi-node GPU clusters, optimizing for maximum throughput and hardware utilization.

  • Develop and maintain training infrastructure using advanced parallelization frameworks (verl, trl, DeepSpeed, FSDP) to support rapidly evolving research needs.

  • Integrate high-performance inference engines like vLLM directly into RL generation loops to reduce rollout latency and accelerate training cycles.

  • Implement robust profiling and debugging pipelines to diagnose bottlenecks in GPU memory, compute, and inter-node communication.

  • Collaborate with…
