Improving text classification through pre-attention mechanism-derived lexicons
- Author
Wang, Zhe; Li, Qingbiao; Wang, Bin; Wu, Tong; Chang, Chengwei
- Subjects
ARTIFICIAL neural networks; DATA mining; LEXICON; BIG data; CLASSIFICATION
- Abstract
A comprehensive, high-quality lexicon plays a crucial role in traditional text classification approaches because it improves the utilization of linguistic knowledge. Although helpful for this task, lexicons have received little attention in current neural network models. First, obtaining a high-quality lexicon is not easy. Second, effective automated lexicon extraction methods are lacking; most lexicons are handcrafted, which is very inefficient at big-data scale. Finally, there is no effective way to use a lexicon within a neural network. To address these limitations, we propose a pre-attention mechanism for text classification that learns attention values for words based on their effect on the classification task. Words with different attention values can then form a domain lexicon. Experiments on three publicly available, authoritative benchmark text classification tasks show that our models obtain competitive results compared with state-of-the-art models. On the same dataset, when the pre-attention mechanism is used to obtain attention values and is followed by different neural networks, the words with high attention values overlap to a high degree, which demonstrates the versatility and portability of the pre-attention mechanism. Stable lexicons can thus be obtained from attention values, which is a promising approach to information extraction. [ABSTRACT FROM AUTHOR]
- Published
2024
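The abstract describes a pre-attention layer that scores each word by its usefulness for classification and then treats high-scoring words as a domain lexicon. The PyTorch sketch below illustrates one way such a mechanism could be wired up; the names (`PreAttentionClassifier`, `extract_lexicon`), the sigmoid scoring layer, and the mean-pooled classifier are illustrative assumptions based only on the abstract, not the authors' published model.

```python
# Minimal sketch of a pre-attention scoring layer for text classification.
# All class/function names and layer choices are assumptions for illustration.
import torch
import torch.nn as nn


class PreAttentionClassifier(nn.Module):
    """Scores each word before pooling; high-scoring words can form a lexicon."""

    def __init__(self, vocab_size: int, embed_dim: int = 128, num_classes: int = 2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Pre-attention: one scalar score per token, learned from the task signal.
        self.score = nn.Linear(embed_dim, 1)
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids: torch.Tensor):
        # token_ids: (batch, seq_len); 0 is the padding index.
        emb = self.embedding(token_ids)                      # (batch, seq_len, dim)
        attn = torch.sigmoid(self.score(emb)).squeeze(-1)    # (batch, seq_len)
        attn = attn * (token_ids != 0).float()               # mask out padding
        # Attention-weighted average of word embeddings; any downstream
        # encoder (CNN, RNN, etc.) could be substituted here.
        pooled = (emb * attn.unsqueeze(-1)).sum(1) / (attn.sum(1, keepdim=True) + 1e-8)
        return self.classifier(pooled), attn


def extract_lexicon(model, token_ids, id_to_word, top_k=20):
    """Average each word's attention over a corpus and keep the top-k words."""
    with torch.no_grad():
        _, attn = model(token_ids)
    scores = {}
    for seq, weights in zip(token_ids.tolist(), attn.tolist()):
        for tok, w in zip(seq, weights):
            if tok != 0:
                scores.setdefault(tok, []).append(w)
    ranked = sorted(scores, key=lambda t: sum(scores[t]) / len(scores[t]), reverse=True)
    return [id_to_word[t] for t in ranked[:top_k]]
```

After training on a labeled corpus, ranking vocabulary words by their average attention weight in this way would yield the kind of stable, automatically extracted domain lexicon the abstract describes.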