M-DA: A Multifeature Text Data-Augmentation Model for Improving Accuracy of Chinese Sentiment Analysis

Authors :
Liya Wang
Xinxin Xu
Changhui Liu
Zhe Chen
Source :
Scientific Programming.
Publication Year :
2022
Publisher :
Hindawi, 2022.

Abstract

A neural network based on word or character embeddings is the mainstream framework for text sentiment analysis and has achieved good results. However, such models do not learn from part-of-speech (POS) tagging or sequence-tagging information. In this research, we propose a multifeature text data-augmentation model (M-DA) with a multiple-input, single-output network structure to address this problem in Chinese text sentiment analysis. First, we derive several sequences from the Chinese text: the word sequence, POS sequence, character sequence, char_pos sequence, and char_4tag sequence. We then combine char_pos and char_4tag into a new sequence (4tag_pos) and use it to label the characters, yielding a reconstructed character sequence (char_4tag_pos) that serves as the augmented text. Next, the Word2Vec method is used to train the initial embedding of the reconstructed characters. Finally, a BiLSTM network captures the long-term dependencies within the sequences, and dropout and an attention mechanism are applied to improve accuracy. During the experiments, we also found that it is better to feed both the original sequence and the augmented sequence into the BiLSTM network; the proposed model therefore fuses multiple sequences into the final embedding with either a concatenate or a dot operation. Multiple groups of comparison experiments on the data set show that the proposed M-DA model outperforms traditional deep learning methods in terms of precision, recall, F-measure, and accuracy, at a relatively small additional time cost.
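To make the sequence-construction step concrete, the following is a minimal sketch of how the multifeature sequences described in the abstract could be built. The choice of jieba as segmenter/POS tagger, the B/M/E/S interpretation of the "4tag" scheme, and the separator characters used to join labels are all assumptions for illustration; the paper does not specify a toolkit or label format.

```python
# Sketch of the multifeature sequence construction (word, pos, char, char_pos,
# char_4tag, 4tag_pos, char_4tag_pos). jieba and the B/M/E/S scheme are assumptions.
import jieba.posseg as pseg

def build_sequences(sentence):
    """Return the sequences used as augmented inputs for one Chinese sentence."""
    words, pos_tags = [], []
    for word, pos in pseg.cut(sentence):       # word segmentation + POS tagging
        words.append(word)
        pos_tags.append(pos)

    chars, char_pos, char_4tag = [], [], []
    for word, pos in zip(words, pos_tags):
        for i, ch in enumerate(word):
            chars.append(ch)
            char_pos.append(pos)                # POS of the word this char belongs to
            if len(word) == 1:
                char_4tag.append("S")           # single-character word
            elif i == 0:
                char_4tag.append("B")           # word-begin character
            elif i == len(word) - 1:
                char_4tag.append("E")           # word-end character
            else:
                char_4tag.append("M")           # word-middle character

    # 4tag_pos combines the boundary tag with the POS tag; char_4tag_pos marks each
    # character with that combined label, giving the reconstructed character sequence.
    tag_pos = [f"{t}_{p}" for t, p in zip(char_4tag, char_pos)]
    char_4tag_pos = [f"{c}/{tp}" for c, tp in zip(chars, tag_pos)]
    return {
        "word": words, "pos": pos_tags, "char": chars,
        "char_pos": char_pos, "char_4tag": char_4tag,
        "4tag_pos": tag_pos, "char_4tag_pos": char_4tag_pos,
    }

if __name__ == "__main__":
    print(build_sequences("今天天气很好"))
```

The downstream network could then take the original character sequence and the reconstructed char_4tag_pos sequence as parallel inputs. Below is a hedged Keras sketch of such a multiple-input, single-output BiLSTM with concatenate fusion, dropout, and a simple attention pooling; it is not the authors' exact architecture, and all vocabulary sizes, layer widths, and rates are illustrative. The embedding layers could be initialised with Word2Vec vectors trained on the corresponding sequences (e.g., via gensim).

```python
# Hedged sketch: two-input BiLSTM with concatenate fusion, dropout, and attention.
import tensorflow as tf
from tensorflow.keras import layers, Model

MAX_LEN, VOCAB_CHAR, VOCAB_AUG, EMB_DIM, NUM_CLASSES = 100, 5000, 8000, 128, 2

inp_char = layers.Input(shape=(MAX_LEN,), name="char_seq")
inp_aug = layers.Input(shape=(MAX_LEN,), name="char_4tag_pos_seq")

emb_char = layers.Embedding(VOCAB_CHAR, EMB_DIM)(inp_char)   # optionally Word2Vec-initialised
emb_aug = layers.Embedding(VOCAB_AUG, EMB_DIM)(inp_aug)

merged = layers.Concatenate()([emb_char, emb_aug])            # "concatenate" fusion of the embeddings
h = layers.Bidirectional(layers.LSTM(128, return_sequences=True, dropout=0.3))(merged)

# simple additive attention pooling over the time steps
scores = layers.Dense(1, activation="tanh")(h)
weights = layers.Softmax(axis=1)(scores)
context = layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, weights])

out = layers.Dense(NUM_CLASSES, activation="softmax")(context)
model = Model([inp_char, inp_aug], out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```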

Details

Language :
English
ISSN :
1058-9244
Database :
OpenAIRE
Journal :
Scientific Programming
Accession number :
edsair.doi.dedup.....24aa3159f9001779fcef0b2c546d51d7
Full Text :
https://doi.org/10.1155/2022/3264378