×
Register Here to Apply for Jobs or Post Jobs. X

AI Infrastructure Architect

Job in City of Edinburgh, Edinburgh, City of Edinburgh Area, EH1, Scotland, UK
Listing for: microTECH Global Limited
Full Time position
Listed on 2026-02-09
Job specializations:
  • IT/Tech
    AI Engineer, Systems Engineer
Job Description & How to Apply Below
Location: City of Edinburgh

Job Title:

AI Infrastructure Architect

Location:

Edinburgh, Scotland

Type:
Permanent

On-Site Working Required, No Sponsorship Provided

Responsibilities

Design a unified AI Infra & Serving architecture platform for composite AI workloads such as LLM Training & Inference, RLHF, Agent, and Multimodal processing. This platform will integrate inference, orchestration, and state management, defining the technical evolution path for Serverless AI + Agentic Serving

Design a heterogeneous execution framework across CPU/GPU/NPU for agent memory, tool invocation, and long-running multi-turn conversations and tasks. Build an efficient memory/KV-cache/vector store/logging and state-management subsystem to support agent retrieval, planning, and persistent memory.

Build a high-performance Runtime/Framework that defines the next-generation Serverless AI foundation through elastic scaling, cold start optimization, batch processing, function-based inference, request orchestration, dynamic decoupled deployment, and other features to support performance scenarios such as multiple models, multi-tenancy, and high concurrency.

Key Requirements
  • Strong foundational knowledge in system architecture, or computer architecture, operating systems, and runtime environments;
  • Hands-on experience with Serverless architectures and cloud-native optimization technologies such as containers, Kubernetes, service orchestration, and autoscaling
  • vLLM, SGLang, Ray Serve, etc.); understand common optimization concepts such as continuous batching, KV-Cache reuse, parallelism, and compression/quantization/distillation
  • Proficient in using Profiling/Tracing tools; experienced in analyzing and optimizing system-level bottlenecks regarding GPU utilization, memory/bandwidth, Interconnect Fabric, and network/storage paths
  • Proficient in at least one system-level language (e.g., C/C++, Go, Rust) and one scripting language (e.g., Python)

If you're interested in applying, please reach out to

#J-18808-Ljbffr
Note that applications are not being accepted from your jurisdiction for this job currently via this jobsite. Candidate preferences are the decision of the Employer or Recruiting Agent, and are controlled by them alone.
To Search, View & Apply for jobs on this site that accept applications from your location or country, tap here to make a Search:
 
 
 
Search for further Jobs Here:
(Try combinations for better Results! Or enter less keywords for broader Results)
Location
Increase/decrease your Search Radius (miles)

Job Posting Language
Employment Category
Education (minimum level)
Filters
Education Level
Experience Level (years)
Posted in last:
Salary