Evaluating and enhancing cross-domain rank predictability of textual entailment datasets.

Authors :: Lee, Cheng-Wei
Lin, Chuan-Jie
Shima, Hideki
Hsu, Wen-Lian
Source :: 2012 IEEE 13th International Conference on Information Reuse & Integration (IRI); 1/1/2012, p51-58, 8p
Publication Year :: 2012
Abstract :: Textual Entailment (TE) is the task of recognizing entailment, paraphrase, and contradiction relations between a given text pair. The goal of textual entailment research is to develop a core inference component that can be applied to various domains, such as IR or NLP. Since the domain that a TE system is applied to may differ from its source domain, it is crucial to develop proper datasets for measuring the cross-domain ability of a TE system. We propose using Kendall's tau to measure a dataset's cross-domain rank predictability. Our analysis shows that incorporating “artificial pairs” into a dataset helps enhance its rank predictability. We also find that the completeness of guidelines has no obvious effect on the rank predictability of a dataset. More investigation is needed to validate these findings; however, they suggest some new directions for the creation of TE datasets in the future. [ABSTRACT FROM PUBLISHER]
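
Illustrative note (not part of the publisher's abstract): the abstract names Kendall's tau as the measure of cross-domain rank predictability but does not spell out the procedure. The sketch below assumes one plausible reading, namely that the same set of TE systems is scored on a source-domain dataset and on a target-domain dataset, and that tau is computed between the two resulting rankings. The system names and accuracy values are hypothetical, and the tau-a variant (no tie correction) is an assumption; the paper may use a different variant.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two equal-length lists of scores.

    Tied pairs count as neither concordant nor discordant
    (an assumption; no tie correction is applied).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    n = len(scores_a)
    if n < 2:
        raise ValueError("need at least two systems to compare rankings")

    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product tells whether both datasets order
        # systems i and j the same way.
        sign = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Hypothetical accuracies of four TE systems on two datasets.
source_domain = {"sysA": 0.62, "sysB": 0.58, "sysC": 0.71, "sysD": 0.66}
target_domain = {"sysA": 0.55, "sysB": 0.57, "sysC": 0.64, "sysD": 0.60}

systems = sorted(source_domain)
tau = kendall_tau(
    [source_domain[s] for s in systems],
    [target_domain[s] for s in systems],
)
print(f"Kendall's tau between the two rankings: {tau:.3f}")
```

Under this reading, a tau close to 1 would mean that the ranking of systems on the source dataset largely predicts their ranking in the target domain, while a tau near 0 would mean the source dataset says little about relative performance elsewhere.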

Language :: English
ISBNs :: 9781467322829
Database :: Complementary Index
Journal :: 2012 IEEE 13th International Conference on Information Reuse & Integration (IRI)
Publication Type :: Conference
Accession number :: 86536197
Full Text :: https://doi.org/10.1109/IRI.2012.6302990
