Heterogeneous graph contrastive learning with adaptive data augmentation for semi‐supervised short text classification.

Authors :
Wu, Mingqiang
Xu, Zhuoming
Zheng, Lei
Source :
Expert Systems. Oct2024, p1. 28p. 5 Illustrations.
Publication Year :
2024

Abstract

Short text classification is widely used in many fields. Because labelled data are scarce, short text classification in a semi‐supervised learning setting has become increasingly popular. Semi‐supervised short text classification methods based on graph neural networks can achieve state‐of‐the‐art classification performance by exploiting the expressive power of graph neural networks. However, these methods usually fail to mine the hidden patterns in the large number of short text nodes in the graph to optimize the short text node embeddings, which limits the semantic representation power of the short texts and leads to suboptimal classification performance. To overcome this limitation, this paper proposes a novel semi‐supervised short text classification method called Heterogeneous Graph Contrastive Learning with Adaptive Data Augmentation (HGCLADA). In the knowledge‐base‐guided, soft prompt‐based data augmentation component, words related to the tag words are used to optimize the soft prompts for generating diverse augmented samples. In the heterogeneous graph contrastive learning framework component, a heterogeneous graph constructed from short texts and keywords is proposed, together with an effective edge augmentation scheme based on a short text clustering algorithm. The resulting optimized short text embeddings enable effective semi‐supervised short text classification. Extensive experiments on six benchmark datasets show that HGCLADA outperforms four classes of state‐of‐the‐art methods in classification accuracy, most notably with a significant improvement of 8.74% on the TagMyNews dataset when each class contains only 20 labelled samples. [ABSTRACT FROM AUTHOR]
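
The abstract does not state which contrastive objective HGCLADA optimizes. As a rough illustration of what contrastive learning over two views of the same nodes generally involves, the sketch below computes a generic InfoNCE-style loss in plain Python. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `cosine` and `info_nce`, the `temperature` value, and the choice of cosine similarity. It also assumes every embedding is a non-zero vector.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Generic InfoNCE-style loss (illustrative, not the paper's objective).

    view_a[i] and view_b[i] are embeddings of the same node under two
    different views (the positive pair); view_b[j] with j != i serves
    as a negative for node i.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        # Temperature-scaled similarity of node i to every node in the other view.
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Negative log-probability assigned to the positive pair.
        total += log_denominator - logits[i]
    return total / n


# Two nodes whose views agree, versus the same views with positives swapped.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case `aligned` is smaller than `swapped`, reflecting that the loss rewards embeddings in which each node is closest to its own counterpart in the other view.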

Details

Language :
English
ISSN :
02664720
Database :
Academic Search Index
Journal :
Expert Systems
Publication Type :
Academic Journal
Accession number :
180147659
Full Text :
https://doi.org/10.1111/exsy.13744