Adapting Machine Translation Engines to the Needs of Cultural Heritage Metadata.

Authors :
Chatzitheodorou, Konstantinos
Kaldeli, Eirini
Isaac, Antoine
Scalia, Paolo
Grau Lacal, Carmen
Escrivá, Mª Ángeles García
Source :
Information Technology & Libraries; Sep 2024, Vol. 43 Issue 3, p1-17, 17p
Publication Year :
2024

Abstract

The Europeana digital library features cultural heritage collections from over 3,000 European institutions described in 37 languages. However, most textual metadata describe the records in a single language, the data providers' language. Improving Europeana's multilingual accessibility presents challenges due to the unique characteristics of cultural heritage metadata, which are often expressed in short phrases and use in-domain terminology. This work presents the approach and results of the EuropeanaTranslate project, which aimed to translate Europeana metadata records from 23 EU languages into English. Machine Translation engines were trained on a cleaned selection of bilingual and synthetic data from Europeana, including multilingual vocabularies and relevant cultural heritage repositories. Automatic translations were evaluated through standard metrics and human assessments by linguists and cultural heritage domain experts. For most languages, the results showed significant improvements over both the generic engines used before the in-domain training and the eTranslation service. The EuropeanaTranslate engines have translated over 29 million metadata records on Europeana.eu. Additionally, the MT engines and training datasets are publicly available via the European Language Grid Catalogue and the ELRC-SHARE repository. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0730-9295
Volume :
43
Issue :
3
Database :
Complementary Index
Journal :
Information Technology & Libraries
Publication Type :
Academic Journal
Accession number :
179871819
Full Text :
https://doi.org/10.5860/ital.v43i3.17247