AI Data Engineer
Listed on 2025-12-19
IT/Tech
Data Engineer, Data Scientist
The Lenovo AI Technology Center (LATC), Lenovo’s global AI Center of Excellence, is driving our transformation into an AI-first organization. We are assembling a world-class team of researchers, engineers, and innovators to position Lenovo and its customers at the forefront of the generational shift toward AI. Lenovo is one of the world’s leading computing companies, delivering products across the entire technology spectrum: wearables, smartphones (Motorola), laptops (ThinkPad, Yoga), PCs, workstations, servers, and services/solutions.
This unmatched breadth gives us a unique canvas for AI innovation, including the ability to rapidly deploy cutting‑edge foundation models and to enable flexible, hybrid‑cloud, and agentic computing across our full product portfolio. To this end, we are building the next wave of AI core technologies and platforms that leverage and evolve with the fast‑moving AI ecosystem, including novel model and agentic orchestration & collaboration across mobile, edge, and cloud resources.
This space is evolving fast, and so are we.
If you’re ready to shape AI at a truly global scale, with products that touch every corner of life and work, there’s no better time to join us.
Lenovo is seeking a talented and motivated Data Engineer/Scientist to join our growing team. This role is critical to the success of our machine learning initiatives, focusing on the creation, quality control, and governance of the datasets that power our models. You will bridge the gap between raw data and model readiness, working closely with model developers to understand their needs and deliver high‑quality, reliable data.
This is a hands‑on role requiring strong technical skills in data engineering, data analysis, and machine learning fundamentals. If you are passionate about making Smarter Technology For All, come help us realize our Hybrid AI vision!
- Data Creation & Annotation: Design, build, and implement processes for creating task-specific training datasets. This may include data labeling, annotation, and data augmentation techniques.
- Data Pipeline Development: Leverage tools and technologies to accelerate dataset creation and improvement. This includes scripting, automation, and potentially working with data labeling platforms.
- Data Quality & Evaluation: Perform thorough data analysis to assess data quality, identify anomalies, and ensure data integrity. Use machine learning tools and techniques to evaluate dataset performance and identify areas for improvement.
- Big Data Technologies: Use database systems (SQL and NoSQL) and big data tools (e.g., Spark, Hadoop, and cloud-based data warehouses such as Snowflake, Redshift, or BigQuery) to process, transform, and store large datasets.
- Data Governance & Lineage: Implement and maintain data governance best practices, including data source tracking, data lineage documentation, and license management. Ensure compliance with data privacy regulations.
- Collaboration with Model Developers: Work closely with machine learning engineers and data scientists to understand their data requirements, provide clean, well-documented datasets, and iterate on data solutions based on model performance feedback.
- Documentation: Create and maintain clear, concise documentation for data pipelines, data quality checks, and data governance procedures.
- Stay Current: Keep up to date with the latest advancements in data engineering, machine learning, and data governance.
- Education: Bachelor’s or Master’s degree in Computer Science, Computer Engineering, Electrical Engineering, Statistics, Mathematics, or a related field.
- Experience: 8+ years of experience in a data engineering or data science role.
- Programming Skills: Proficiency in Python and SQL. Experience with other languages (e.g., Java, Scala) is a plus.
- Database Skills: Strong experience with relational databases (e.g., PostgreSQL, MySQL) and NoSQL databases (e.g., MongoDB, Cassandra).
- Big Data Tools: Experience with big data technologies such as Spark, Hadoop, or cloud-based data warehousing solutions (Snowflake, Redshift, BigQuery).
- Data…