MultiLexBATS: Multilingual Dataset of Lexical Semantic Relations

Authors :
Calzolari, N
Kan, MY
Hoste, V
Lenci, A
Sakti, S
Xue, N
Gromann, D
Goncalo Oliveira, H
Pitarch, L
Apostol, E
Bernad, J
Bytyçi, E
Cantone, C
Carvalho, S
Frontini, F
Garabik, R
Gracia, J
Granata, L
Khan, F
Knez, T
Labropoulou, P
Liebeskind, C
Pia di Buono, M
Ostroški Anić, A
Rackevičienė, S
Rodrigues, R
Sérasset, G
Selmistraitis, L
Sidibé, M
Silvano, P
Spahiu, B
Sogutlu, E
Stanković, R
Truică, C
Valūnaitė Oleškevičienė, G
Zitnik, S
Zdravkova, K
Dagmar Gromann
Hugo Goncalo Oliveira
Lucia Pitarch
Elena-Simona Apostol
Jordi Bernad
Eliot Bytyçi
Chiara Cantone
Sara Carvalho
Francesca Frontini
Radovan Garabik
Jorge Gracia
Letizia Granata
Fahad Khan
Timotej Knez
Penny Labropoulou
Chaya Liebeskind
Maria Pia di Buono
Ana Ostroški Anić
Sigita Rackevičienė
Ricardo Rodrigues
Gilles Sérasset
Linas Selmistraitis
Mahammadou Sidibé
Purificação Silvano
Blerina Spahiu
Enriketa Sogutlu
Ranka Stanković
Ciprian-Octavian Truică
Giedrė Valūnaitė Oleškevičienė
Slavko Zitnik
Katerina Zdravkova
Publication Year :
2024

Abstract

Understanding the relation between the meanings of words is an important part of comprehending natural language. Prior work has focused either on analysing lexical semantic relations in word embeddings or on probing pretrained language models (PLMs), with some exceptions. Given the rarity of highly multilingual benchmarks, it is unclear to what extent PLMs capture relational knowledge and are able to transfer it across languages. To start addressing this question, we propose MultiLexBATS, a multilingual parallel dataset of lexical semantic relations in 15 languages, adapted from BATS and including low-resource languages such as Bambara, Lithuanian, and Albanian. As an experiment on cross-lingual transfer of relational knowledge, we test the PLMs' ability to (1) capture analogies across languages, and (2) predict translation targets. We find considerable differences across relation types and languages, with a clear preference for hypernymy and antonymy as well as for Romance languages.

Details

Database :
OAIster
Notes :
English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1440494053
Document Type :
Electronic Resource