Few-NERD: A Few-Shot Named Entity Recognition Dataset

Authors :
Ding, Ning
Xu, Guangwei
Chen, Yulin
Wang, Xiaobin
Han, Xu
Xie, Pengjun
Zheng, Hai-Tao
Liu, Zhiyuan
Publication Year :
2021

Abstract

Recently, considerable literature has grown up around the theme of few-shot named entity recognition (NER), but little published benchmark data focuses specifically on this practical and challenging task. Current approaches collect existing supervised NER datasets and re-organize them into the few-shot setting for empirical study. These strategies conventionally aim to recognize coarse-grained entity types with few examples, while in practice, most unseen entity types are fine-grained. In this paper, we present Few-NERD, a large-scale human-annotated few-shot NER dataset with a hierarchy of 8 coarse-grained and 66 fine-grained entity types. Few-NERD consists of 188,238 sentences from Wikipedia, comprising 4,601,160 words, each annotated either as context or as part of a two-level entity type. To the best of our knowledge, this is the first few-shot NER dataset and the largest human-crafted NER dataset. We construct benchmark tasks with different emphases to comprehensively assess the generalization capability of models. Extensive empirical results and analysis show that Few-NERD is challenging and that the problem requires further research. We make Few-NERD public at https://ningding97.github.io/fewnerd/.

Comment: Accepted by ACL-IJCNLP 2021 (long paper), update

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2105.07464
Document Type :
Working Paper