
ChID: A Large-scale Chinese IDiom Dataset for Cloze Test

Authors :
Aixin Sun
Chujie Zheng
Minlie Huang
Source :
Scopus-Elsevier, ACL (1)
Publication Year :
2019

Abstract

Cloze-style reading comprehension in Chinese is still limited due to the lack of various corpora. In this paper we propose ChID, a large-scale Chinese cloze test dataset that studies the comprehension of idioms, a unique language phenomenon in Chinese. In this corpus, the idioms in a passage are replaced by blank symbols, and the correct answer must be chosen from well-designed candidate idioms. We carefully study how the design of candidate idioms and the representation of idioms affect the performance of state-of-the-art models. Results show that machine accuracy is substantially worse than that of humans, indicating considerable room for further research.

Accepted to ACL 2019 (long paper)

Details

Language :
English
Database :
OpenAIRE
Journal :
Scopus-Elsevier, ACL (1)
Accession number :
edsair.doi.dedup.....655e3a6a38c5e703ba6bf1e4ec0bcb1f