Backretrieval: An Image-Pivoted Evaluation Metric for Cross-Lingual Text Representations Without Parallel Corpora

Authors :
Niall Twomey
Mikhail Fain
Danushka Bollegala
Diaz, Fernando
Shah, Chirag
Suel, Torsten
Castells, Pablo
Jones, Rosie
Sakai, Tetsuya
Source :
SIGIR
Publication Year :
2021
Publisher :
ACM, 2021.

Abstract

Cross-lingual text representations have gained popularity lately and act as the backbone of many tasks such as unsupervised machine translation and cross-lingual information retrieval, to name a few. However, evaluation of such representations is difficult in the domains beyond standard benchmarks due to the necessity of obtaining domain-specific parallel language data across different pairs of languages. In this paper, we propose an automatic metric for evaluating the quality of cross-lingual textual representations using images as a proxy in a paired image-text evaluation dataset. Experimentally, Backretrieval is shown to highly correlate with ground truth metrics on annotated datasets, and our analysis shows statistically significant improvements over baselines. Our experiments conclude with a case study on a recipe dataset without parallel cross-lingual data. We illustrate how to judge cross-lingual embedding quality with Backretrieval, and validate the outcome with a small human study.

Comment: SIGIR 2021

Details

Language :
English
Database :
OpenAIRE
Journal :
SIGIR
Accession number :
edsair.doi.dedup.....161f7cfe9ab205c9552317af8c40dd89