Bidirectional Gated Temporal Convolution with Attention for text classification
- Author
- Jiansi Ren, Ruoxiang Wang, Gang Liu, Zhe Chen, and Wei Wu
- Subjects
- Computer science, Artificial Intelligence, Deep learning, Pattern recognition, Feature extraction, Feature aggregation, Convolution, Cognitive Neuroscience, Computer Science Applications
- Abstract
In text classification models based on deep learning, feature extraction and feature aggregation are two key steps. As one of the basic feature extraction methods, the CNN is limited by its inability to effectively extract temporal features from text data. Max-pooling greatly reduces the amount of computation during feature aggregation, but because it discards some text features, it has an adverse effect on classification results. In this paper, to address these two issues, a Bidirectional Gated Temporal Convolutional Attention (BG-TCA) model is proposed. In the feature extraction stage, the BG-TCA model uses a bidirectional TCN to extract bidirectional temporal features from text data, with a gating mechanism similar to that of the LSTM added between the convolutional layers. In the feature aggregation stage, the BG-TCA model replaces max-pooling with an attention mechanism, which distinguishes the importance of different features while retaining text features as fully as possible. Finally, experimental results on five benchmark datasets show that the BG-TCA model achieves substantially higher classification accuracy than the basic models and outperforms several other state-of-the-art models.
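To make the two ideas in the abstract concrete, below is a minimal sketch in PyTorch of a gated temporal convolution applied in both reading directions for feature extraction, followed by attention-based pooling in place of max-pooling for feature aggregation. The layer sizes, the exact gating form, and the attention scoring function are illustrative assumptions, not the authors' published BG-TCA configuration.

```python
# Minimal sketch of the BG-TCA idea (assumed PyTorch implementation, not the
# authors' exact architecture): bidirectional gated temporal convolution for
# feature extraction, attention pooling instead of max-pooling for aggregation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedTemporalConv(nn.Module):
    """Causal 1-D convolution with a tanh/sigmoid gate (assumed gating form)."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left-pad only, to stay causal
        self.conv_f = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.conv_g = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                        # x: (batch, channels, time)
        x_pad = F.pad(x, (self.pad, 0))
        return torch.tanh(self.conv_f(x_pad)) * torch.sigmoid(self.conv_g(x_pad))

class BGTCASketch(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.fwd = GatedTemporalConv(embed_dim)   # reads the sequence left-to-right
        self.bwd = GatedTemporalConv(embed_dim)   # reads the reversed sequence
        self.att = nn.Linear(2 * embed_dim, 1)    # attention scores replace max-pooling
        self.out = nn.Linear(2 * embed_dim, num_classes)

    def forward(self, tokens):                    # tokens: (batch, time)
        e = self.embed(tokens).transpose(1, 2)    # (batch, embed_dim, time)
        h_fwd = self.fwd(e)
        h_bwd = self.bwd(e.flip(-1)).flip(-1)     # bidirectional temporal features
        h = torch.cat([h_fwd, h_bwd], dim=1).transpose(1, 2)       # (batch, time, 2*dim)
        weights = torch.softmax(self.att(h).squeeze(-1), dim=-1)   # per-position importance
        pooled = (weights.unsqueeze(-1) * h).sum(dim=1)            # weighted aggregation
        return self.out(pooled)

# Example usage with toy shapes:
# logits = BGTCASketch(vocab_size=30000)(torch.randint(0, 30000, (8, 50)))
```

In this sketch the attention weights give every position a learned contribution to the pooled representation, which is the property the abstract credits for avoiding the feature loss of max-pooling.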
- Published
- 2021