Machine Learning Systems Engineer

Job in Manchester, Greater Manchester, M9, England, UK
Listing for: ConnexAI
Full Time position
Listed on 2026-05-16
Job specializations:
  • IT/Tech
    AI Software Engineer, Systems Engineer, Machine Learning/ML Engineer
Salary/Wage Range or Industry Benchmark: 125000 - 150000 GBP Yearly
Job Description & How to Apply Below

We are building real-time conversational AI systems for contact centres, powered by ASR, LLMs, and TTS.

As an LLM Systems Engineer, you will sit within our LLM team and focus on the systems layer that makes production Conversational AI work. You’ll design and improve the infrastructure, orchestration, and runtime systems behind low-latency conversational AI workflows.

This role focuses on solving the technical challenges associated with delivering real-time AI conversations: coordinating complex AI systems under strict latency and reliability constraints.

What you’ll do
  • Design and build systems that enable LLM workflows to maintain real-time responses even under peak load
  • Improve latency, throughput, concurrency, and reliability across our production systems
  • Build orchestration logic for model calls, services, queues, retries, fallbacks, and routing that balances load management with low response times
  • Help scale systems to support high volumes of concurrent real-time conversations
  • Optimise memory usage and resource efficiency across LLM-powered services
  • Deploy and support autoscaling for AI services running on AWS-based systems
  • Build observability into AI workflows, including monitoring, logging, alerting, and performance tracking
  • Work closely with data scientists, MLEs, prototype engineers, and backend engineers
  • Help turn LLM capabilities into stable, scalable production Conversational AI systems
What we’re looking for
  • Experience building production backend systems, distributed systems, or ML infrastructure
  • Strong understanding of scalability, latency, reliability, and performance engineering
  • Experience with cloud infrastructure, ideally AWS
  • Experience working with APIs, queues, service orchestration, and production monitoring
  • Understanding of how LLMs are used in production systems
  • Ability to reason about concurrency, throughput, memory usage, and failure handling
Nice to have
  • Experience with conversational AI, voice systems, ASR, TTS, or real-time streaming systems
  • Experience with model serving or inference infrastructure
  • Exposure to open-source LLMs or LLM orchestration frameworks
  • Experience with Docker, Kubernetes, ECS, or similar container orchestration tools
  • Experience with Redis, Kafka, Kinesis, SQS, or similar queueing/event systems
  • Familiarity with monitoring tools such as CloudWatch, Prometheus, or Grafana

You’ll help build the systems behind real-time AI conversations used in production contact centre environments. This is a high-impact engineering role focused on low latency, scalability, reliability, and making LLM-powered systems work under real-world load.
