1. UESTS: An Unsupervised Ensemble Semantic Textual Similarity Method
- Author
Reem Bahgat, Basma Hassan, Ibrahim Farag, and Samir E. AbdelRahman
- Subjects
General Computer Science, Computer science, BabelNet, Semantics, Semantic network, Semantic similarity, String kernel, Similarity (psychology), General Materials Science, Electrical and Electronic Engineering, Semantic textual similarity, Deep learning, General Engineering, word alignment, Edit distance, Artificial intelligence, text processing, lcsh:Electrical engineering. Electronics. Nuclear engineering, lcsh:TK1-9971, Natural language processing, Word (computer architecture), SemEval
- Abstract
Semantic textual similarity (STS) is the task of assessing the degree of similarity between two texts in terms of meaning. Several approaches have been proposed in the literature to determine the semantic similarity between texts. The most promising recent work has used supervised approaches. Unsupervised STS approaches do not require training data, but they still suffer from some limitations. Word alignment has been widely used in state-of-the-art approaches. Building on this, the paper makes three contributions. First, a new synset-oriented word aligner is presented, which relies on BabelNet, a very large multilingual semantic network. Second, three unsupervised STS approaches are proposed: string kernel-based (SK), alignment-based (AL), and weighted alignment-based (WAL). Third, some limitations of state-of-the-art approaches are tackled, and different similarity methods are shown to be complementary through a proposed unsupervised ensemble STS (UESTS) approach. UESTS combines the merits of four similarity measures: the proposed alignment-based measure, a surface-based measure, a corpus-based measure, and an enhanced edit distance. The experimental results show that including the proposed aligner in STS is effective. Across all evaluation data sets, the proposed UESTS outperforms the state-of-the-art unsupervised approaches.
- Published
- 2019