MammoTab: a giant and comprehensive dataset for Semantic Table Interpretation

Authors :
Efthymiou, V
Jiménez-Ruiz, E
Chen, J
Cutrona, V
Hassanzadeh, O
Sequeda, J
Srinivas, K
Abdelmageed, N
Hulsebos, M
Marzocchi, M
Cremaschi, M
Pozzi, R
Avogadro, R
Palmonari, M
Publication Year :
2022

Abstract

In this paper, we present MammoTab, a dataset composed of 1M Wikipedia tables extracted from over 20M Wikipedia pages and annotated through Wikidata. The lack of datasets of this kind in the state of the art makes MammoTab a good resource for testing and training Semantic Table Interpretation approaches. The dataset has been designed to cover several key challenges, such as disambiguation, homonymy, and NIL-mentions. The dataset has been evaluated using MTab, one of the best approaches of the SemTab challenge.

Details

Database :
OAIster
Notes :
ELETTRONICO, English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1430691073
Document Type :
Electronic Resource