MammoTab: a giant and comprehensive dataset for Semantic Table Interpretation
- Publication Year :
- 2022
Abstract
- In this paper, we present MammoTab, a dataset composed of 1M Wikipedia tables extracted from over 20M Wikipedia pages and annotated through Wikidata. The lack of datasets of this kind in the state of the art makes MammoTab a good resource for testing and training Semantic Table Interpretation approaches. The dataset has been designed to cover several key challenges, such as disambiguation, homonymy, and NIL-mentions. The dataset has been evaluated using MTab, one of the best-performing approaches in the SemTab challenge.
Details
- Database :
- OAIster
- Notes :
- Electronic, English
- Publication Type :
- Electronic Resource
- Accession number :
- edsoai.on1430691073
- Document Type :
- Electronic Resource