
Unseen Filler Generalization In Attention-based Natural Language Reasoning Models

Authors :
Hsiao-Hua Cheng
Yi-Fu Fu
Chin-Hui Chen
Shou-De Lin
Source :
CogMI
Publication Year :
2020
Publisher :
IEEE, 2020.

Abstract

Recent natural language reasoning models have achieved human-level accuracy on several benchmark datasets such as bAbI. While these results are impressive, in this paper we argue through experimental analysis that several existing attention-based models struggle to generalize to named entities not seen in the training data. We thus propose Unseen Filler Generalization (UFG) as a task, along with two new datasets, to evaluate the filler generalization capability of a natural language reasoning model. We also propose a simple yet general strategy that can be applied to various models to handle the UFG challenge by modifying the entity occurrence distribution in the training data. This strategy allows the model to encounter unseen entities during training, preventing it from overfitting to only a few specific named entities. Our experiments show that this strategy significantly boosts the filler generalization capability of three existing models: Entity Network, Working Memory Network, and Universal Transformers.
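To make the proposed strategy concrete, the sketch below illustrates one plausible reading of "modifying the entity occurrence distribution in the training data": resampling the named entities in each bAbI-style story from a larger name pool, so no single name dominates training. The name pool, function name, and substitution scheme are hypothetical illustrations, not the authors' exact procedure.

```python
import random

# Hypothetical pool of entity names; the paper only states that the entity
# occurrence distribution is modified so the model sees a wider variety of
# fillers during training, so this is one possible instantiation.
NAME_POOL = ["Avery", "Bailey", "Carmen", "Dakota", "Emery",
             "Finley", "Harper", "Jordan", "Morgan", "Riley"]

def resample_entities(story_tokens, entity_names):
    """Replace each named entity in a story with a distinct name drawn at
    random from a larger pool, broadening the entity distribution."""
    entity_names = sorted(entity_names)
    new_names = random.sample(NAME_POOL, len(entity_names))
    mapping = dict(zip(entity_names, new_names))
    return [mapping.get(tok, tok) for tok in story_tokens]

# Example: a training story that originally only ever mentions Mary and John.
story = "Mary moved to the bathroom . John went to the hallway .".split()
print(" ".join(resample_entities(story, {"Mary", "John"})))
```

Applying such a substitution independently per epoch (or per example) means the model cannot memorize a handful of entity embeddings and must instead learn role-based reasoning, which is the behavior the UFG task is designed to test.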

Details

Database :
OpenAIRE
Journal :
2020 IEEE Second International Conference on Cognitive Machine Intelligence (CogMI)
Accession number :
edsair.doi...........41874ab2d4ed714c74f404cdcc993c41
Full Text :
https://doi.org/10.1109/cogmi50398.2020.00016