AI Engineer (Prompt Engineering & Evaluation), UK or Greece. Job in the London area, Greater London, England, UK. IT/Tech

Position: AI Engineer (Prompt Engineering & Evaluation), Permanent, UK or Greece
Location: Greater London

Role: AI Engineer (Prompt Engineering & Evaluation)

Role type: Permanent

Location: UK or Greece

Preferred start date: ASAP

LIFE AT SATALIA

As an organisation, we push the boundaries of data science and artificial intelligence to solve the most complex problems in the industry.

Satalia, a WPP company, is a community of individuals devoted to working on diverse and challenging projects, allowing you to flex your technical skills whilst working with a tight‑knit team of high‑performing colleagues.

Led by our founder and WPP Chief AI Officer Daniel Hulme, Satalia’s ambition is to become a decentralised organisation of the future. Today, this involves developing tools and processes to liberate and automate manual repetitive tasks, with a focus on freedom, transparency and trust. At the core of our thinking is an approach to wellbeing and inclusivity. We unpack human behaviour and unpick prejudice to ensure a safe and inviting environment.

We offer truly flexible working and allow our employees to find the working practice that makes them most productive. At Satalia, your opinion matters and your achievements are celebrated.

THE ROLE

We are investing heavily in developing next‑generation AI tools for multimodal datasets and a wide range of applications. We are building large‑scale, enterprise‑grade solutions and serving these innovations to our clients and WPP agency partners. As a member of our team, you will work alongside world‑class talent in an environment that fosters not only innovation but also personal growth.

You will be responsible for shaping how our AI models understand and interpret complex creative content. Your work will bridge the gap between raw AI capabilities and the delivery of accurate, reliable, and valuable insights for our FTSE 100 clients. You will be at the forefront of applied AI, ensuring our products are not just powerful, but also precise and trustworthy.

YOUR RESPONSIBILITIES

Collaboration: Work closely with product managers, data scientists, and architects to translate business needs into technical requirements for AI evaluation and application.
Prompt Engineering & LLM Application: Design, develop, and iteratively refine sophisticated prompts for Large Language Models (LLMs).
AI Output Evaluation & Governance: Design and implement evaluation frameworks to ensure our LLM‑based services deliver accurate, reliable outputs. Establish metrics for prompt performance, iterative improvement processes, and drift monitoring. Evaluate and recommend best‑in‑class evaluation tools and methodologies to enhance our capabilities.
Evaluation Dataset Engineering: Build and maintain the infrastructure for high‑quality evaluation datasets representing diverse industry verticals and creative types. This involves designing data pipelines, annotation workflows, quality control systems, and version control. You'll develop intelligent sampling strategies to ensure balanced positive/negative examples across questions about creative elements that drive advertising performance.
Applied AI Research & Integration: Build a production‑ready evaluation system for our LLM‑based services, which extract structured insights from advertising creatives. In your first 6 months, you'll develop an automated evaluation system using LLM‑as‑judge approaches with human‑in‑the‑loop validation, creating a robust, scalable solution for FTSE 100 clients.
Documentation & Knowledge Sharing: Document the AI evaluation process and results to track product quality and maintain user‑facing product and API documentation.

MINIMUM QUALIFICATIONS / SKILLS

Proficiency in Python.
Hands‑on experience in advanced prompt engineering for major LLMs.
Proven experience in designing and implementing evaluation methodologies and quality frameworks for AI/ML model outputs.
Familiarity with modern AI/LLM frameworks (e.g., LangChain, Google GenAI).
Experience working with both structured and unstructured data, particularly in a cloud environment (GCP, AWS, or Azure).
Strong analytical and problem‑solving skills.

PREFERRED QUALIFICATIONS / SKILLS

Experience with Computer Vision APIs or models.
Experience building and deploying scalable API services (e.g.,…

