An unsupervised semantic text similarity measurement model in resource-limited scenes.

Authors :
Xiao, Qi
Qin, Yunchuan
Li, Kenli
Tang, Zhuo
Wu, Fan
Liu, Zhizhong
Source :
Information Sciences. Nov 2022, Vol. 616, p444-460. 17p.
Publication Year :
2022

Abstract

As the basis of many artificial intelligence tasks, text similarity measurement has received extensive attention in current studies. However, few of them focus on resource-limited scenes (i.e., limited computational resources and few training datasets), which are becoming increasingly common and challenging with the development of the Internet of Things. Worse still, popular methods such as deep-neural-network-based methods may lose their power in such scenes, since they typically require considerable computational resources. Most traditional methods, for their part, fail to effectively exploit the semantic information in sentences. As an alternative, this paper proposes a lightweight and semantically rich text similarity measurement model named TES-TK. In this model, a sentence is first transformed into a tree structure called a TES-Tree, which integrates syntactic information, semantic knowledge, and topic distribution to comprehensively represent the multidimensional semantics of the sentence. A modified tree kernel is then designed to calculate the similarity between each pair of TES-Trees, yielding the similarity score of the two corresponding sentences. Experiments on 19 public benchmark datasets (STS2012–2015) demonstrate that the proposed approach performs significantly better than eight peer methods on most datasets. In resource-limited scenes in particular, our approach achieves highly competitive results compared with the latest methods, such as BERT. [ABSTRACT FROM AUTHOR]
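The abstract does not spell out the TES-Tree format or the kernel modification, so the Python sketch below only illustrates the general shape of the computation: a classic subset-tree kernel in the style of Collins and Duffy, cosine-normalized to produce a similarity score. The Node class, the tree labels, and the decay parameter are illustrative assumptions, not the paper's actual data structures or API.

    # Illustrative sketch only: a standard subset-tree kernel over
    # hand-built trees stands in for the TES-TK computation. Node, the
    # labels, and the decay parameter are assumptions, not the paper's API.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Node:
        """A labeled tree node; children is a tuple of Node."""
        label: str
        children: tuple = ()

    def production(n: Node) -> tuple:
        """A node's 'production': its label plus its children's labels."""
        return (n.label, tuple(c.label for c in n.children))

    def delta(n1: Node, n2: Node, decay: float = 0.5) -> float:
        """Weighted count of common subtree fragments rooted at n1 and n2."""
        if production(n1) != production(n2):
            return 0.0
        if not n1.children:          # matching leaves
            return decay
        score = decay
        for c1, c2 in zip(n1.children, n2.children):
            score *= 1.0 + delta(c1, c2, decay)
        return score

    def all_nodes(t: Node):
        yield t
        for c in t.children:
            yield from all_nodes(c)

    def tree_kernel(t1: Node, t2: Node) -> float:
        """Unnormalized kernel: sum of delta over every pair of nodes."""
        return sum(delta(a, b) for a in all_nodes(t1) for b in all_nodes(t2))

    def similarity(t1: Node, t2: Node) -> float:
        """Cosine-normalized kernel score in [0, 1]."""
        k11, k22 = tree_kernel(t1, t1), tree_kernel(t2, t2)
        return tree_kernel(t1, t2) / (k11 * k22) ** 0.5 if k11 and k22 else 0.0

    # Toy trees standing in for two parsed sentences.
    t_a = Node("S", (Node("NP", (Node("dog"),)), Node("VP", (Node("runs"),))))
    t_b = Node("S", (Node("NP", (Node("dog"),)), Node("VP", (Node("walks"),))))
    print(f"similarity = {similarity(t_a, t_b):.3f}")

The cosine normalization keeps scores comparable across sentence pairs of different sizes, which matters because an unnormalized tree kernel grows with the number of nodes.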

Details

Language :
English
ISSN :
0020-0255
Volume :
616
Database :
Academic Search Index
Journal :
Information Sciences
Publication Type :
Periodical
Accession number :
160439412
Full Text :
https://doi.org/10.1016/j.ins.2022.10.127