Language engineer

Job in 1000, Amsterdam, North Holland, Netherlands
Listing for: NLP PEOPLE
Full Time position
Listed on 2026-01-09
Job specializations:
  • IT/Tech
    Data Scientist, Data Analyst
Salary/Wage Range or Industry Benchmark: EUR 60,000 - 80,000 per year
Job Description & How to Apply Below

We are a dynamic and innovative small SaaS company specializing in language data products and services. Our team of 17 is distributed across two offices, in Amsterdam and Thessaloniki.

About the Project

TAUS is executing technical work streams for the European Commission’s BEACON project, focused on collecting, curating, and publishing high-quality parallel text corpora for machine translation in EU candidate country languages. This 9-month project involves processing hundreds of millions of sentences from diverse sources, applying rigorous quality assurance frameworks, and preparing publication-ready datasets for seven language pairs:
English paired with Ukrainian, Serbian, Bosnian, Macedonian, Albanian, Montenegrin, and Romanian/Moldovan, with particular focus on legal and administrative domains.

Position Overview

We seek a skilled and motivated Language Data Engineer to join our technical team for large-scale parallel corpus collection, processing, and quality assurance. You will work hands‑on with real‑world challenges in low‑resource language processing and quality assurance at scale, and you will contribute directly to expanding Europe’s multilingual digital infrastructure.

Responsibilities
  • Download and catalog parallel corpora from public repositories and implement targeted web crawling for legal/administrative domain content.
  • Extract text from diverse formats (PDFs, HTML, document archives) and apply bilingual as well as monolingual corpus mining techniques.
  • Document source provenance, licensing, and metadata comprehensively.
  • Execute preprocessing pipelines: format normalization, sentence segmentation, alignment, language identification, and quality filtering.
  • Handle large‑scale data processing with deduplication and anonymization.
  • Maintain detailed processing logs and quality metrics throughout the pipeline.
  • Validate NLP tool performance across seven language pairs and implement automated quality checks (alignment confidence, language identification, domain classification).
  • Coordinate with linguists for human validation and generate quality reports with statistical metrics.
  • Troubleshoot and resolve quality issues in processing workflows.
  • Contribute to technical deliverables and project documentation meeting EC standards.
  • Collaborate with European Commission experts and cross‑functional teams on methodology and quality criteria.
  • Ensure compliance with EU data governance, GDPR, and licensing requirements.
Company

TAUS

Qualifications
  • 3+ years of work experience with Natural Language Processing (NLP)
  • 3+ years of work experience with Python (Programming Language)
Specific requirements
  • Authorized to work in Yes
Level of experience (years)

Mid Career (2+ years of experience)

Tagged as:
  • Classification
  • Industry
  • Machine Translation
  • Natural Language Processing
  • Netherlands
  • NLP