AI Engineer, LLM Systems & Agentic Workflows Job, Aberdeen area, Scotland, UK, IT/Tech

AI Engineer, LLM Systems & Agentic Workflows

As an AI Engineer focused on LLM Systems, your primary mandate is to design, build, and operate the AI layer that powers intelligent automation across the Command Link platform. You'll be working at the engineering layer of agentic AI: building durable, production‑grade LLM workflows on top of Temporal, implementing security and policy controls around LLM execution, and solving hard problems around prompt injection, output trust, and runtime governance in domain‑specific contexts.

You’ll work closely with Engineering and Product leads to build agentic workflows that turn deterministic workflows for context‑aware insights, triage, investigations, and remediation into reliable, observable, and policy‑compliant AI workflows. That means designing for failure, latency, and adversarial inputs from day one, not retrofitting safety controls after the fact. The space is moving fast, the problems are genuinely unsolved, and we’re looking for someone who has strong opinions about how to build AI systems that are trustworthy in production.

Key Responsibilities:

Agentic workflow engineering: design and build multi‑step LLM workflows using Temporal as the durable orchestration backbone, handling retries, state, parallelism, human‑in‑the‑loop steps, and long‑running agent execution.
Domain‑specific automation: work with subject matter experts to identify, scope, and implement AI‑driven automation for specific business and operational domains; own the full delivery from prototype to production.
LLM security and policy enforcement: implement runtime policy controls around LLM execution, including prompt injection mitigation, output validation, privilege separation (dual‑LLM / quarantined execution patterns), and integration with policy engines.
Parallel and live evaluation: build evaluation frameworks to assess LLM output quality in parallel with production traffic; implement continuous evals, regression detection, and automated quality gates.
Prompt injection defense: apply and adapt state‑of‑the‑art design patterns including the Dual LLM, Plan‑Then‑Execute, and Code‑Then‑Execute patterns to harden agent pipelines against adversarial inputs.
Policy engine integration: integrate tools such as Sequrity.ai to define, enforce, and audit natural‑language security policies over LLM tool use and execution paths.
Observability and auditability: instrument AI workflows with full event history, structured logging of prompts and completions, cost tracking, and latency profiling, making the behaviour of AI systems traceable and debuggable.
LLM steering and control: implement output steering strategies, structured generation, constrained decoding, and fallback routing to ensure models behave within defined operational envelopes.
Collaborate on architecture: work across the engineering team to define standards for how AI capabilities are integrated into the product, setting patterns others will follow.

What You’ll Need for Success:

Experience with complex and large datasets.
2+ years building production LLM‑powered applications beyond RAG prototypes; real systems handling real failure modes.
Hands‑on experience with Temporal (or equivalent durable execution platforms such as Cadence or Conductor) for orchestrating multi‑step, long‑running AI workflows.
Deep understanding of prompt injection attack vectors, mitigation strategies, and the trade‑offs between defense patterns (Dual LLM, CaMeL / Code‑Then‑Execute, Action‑Selector, context minimization).
Experience implementing policy controls and guardrails around LLM execution: RBAC/PBAC for agents, output filtering, semantic validation, and tool‑use restrictions.
Practical experience building parallel evaluation pipelines for LLM outputs: live evals, shadow scoring, regression suites, and automated quality gates.
Strong software engineering fundamentals. You write maintainable, testable code; experience in Python and/or Go preferred.
Familiarity with LLM APIs and inference providers (OpenAI, Anthropic, Mistral, or open‑weight models via vLLM / Ollama).
Understanding of agentic architecture patterns: tool use, multi‑agent delegation, structured outputs,…