ChID: A Large-scale Chinese IDiom Dataset for Cloze Test
- Source :
- Scopus-Elsevier, ACL (1)
- Publication Year :
- 2019
Abstract
- Cloze-style reading comprehension in Chinese is still limited by the lack of varied corpora. In this paper we propose ChID, a large-scale Chinese cloze test dataset that studies the comprehension of idioms, a unique language phenomenon in Chinese. In this corpus, the idioms in a passage are replaced by blank symbols, and the correct answer must be chosen from well-designed candidate idioms. We carefully study how the design of candidate idioms and the representation of idioms affect the performance of state-of-the-art models. Results show that machine accuracy is substantially worse than human accuracy, indicating ample room for further research.
- Accepted to ACL 2019 (long paper)
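The cloze construction described in the abstract (blank out an idiom, then choose the answer from a candidate list) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `#idiom#` placeholder, the function names `make_cloze` and `accuracy`, the dictionary layout, and the example sentence and distractors are all assumptions made for illustration only.

```python
import random

# Assumed blank symbol; the abstract only says "blank symbols".
BLANK = "#idiom#"


def make_cloze(passage, idiom, distractors, seed=0):
    """Blank out the first occurrence of `idiom` and build a shuffled candidate list.

    Hypothetical helper: the record does not describe how candidates are selected,
    so the distractors are simply passed in by the caller.
    """
    if idiom not in passage:
        raise ValueError("idiom does not occur in passage")
    if idiom in distractors:
        raise ValueError("distractors must not contain the answer")
    blanked = passage.replace(idiom, BLANK, 1)
    candidates = [idiom] + list(distractors)
    random.Random(seed).shuffle(candidates)  # seeded so the example is reproducible
    return {
        "passage": blanked,
        "candidates": candidates,
        "answer": candidates.index(idiom),  # index of the correct idiom
    }


def accuracy(predictions, answers):
    """Fraction of blanks for which the predicted index equals the answer index."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


if __name__ == "__main__":
    item = make_cloze("他做事总是画蛇添足。", "画蛇添足", ["守株待兔", "亡羊补牢"])
    print(item["passage"])
    print(item["candidates"], item["answer"])
```

A model under evaluation would return one candidate index per blank, and `accuracy` compares those indices with the stored answers. This mirrors the machine-versus-human accuracy comparison mentioned in the abstract without reproducing any of its reported figures.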
- Subjects :
- Space (punctuation)
FOS: Computer and information sciences
Cloze test
Computer Science - Computation and Language
business.industry
Computer science
02 engineering and technology
Representation (arts)
computer.software_genre
Scale (music)
Comprehension
03 medical and health sciences
0302 clinical medicine
Reading comprehension
030221 ophthalmology & optometry
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Artificial intelligence
Affect (linguistics)
business
computer
Computation and Language (cs.CL)
Natural language processing
Details
- Language :
- English
- Database :
- OpenAIRE
- Journal :
- Scopus-Elsevier, ACL (1)
- Accession number :
- edsair.doi.dedup.....655e3a6a38c5e703ba6bf1e4ec0bcb1f