An Enhanced Sentiment Analysis Framework Based on Pre-Trained Word Embedding.

Authors :
Mohamed, Ensaf Hussein
Moussa, Mohammed ElSaid
Haggag, Mohamed Hassan
Source :
International Journal of Computational Intelligence & Applications. Dec2020, Vol. 19 Issue 4, pN.PAG-N.PAG. 20p.
Publication Year :
2020

Abstract

Sentiment analysis (SA) is a technique that lets people in fields such as business, the economy, research, government, and politics learn about people's opinions, which greatly affects decision-making. SA techniques are classified into lexicon-based techniques, machine learning techniques, and hybrids of the two. Each approach has its limitations and drawbacks: the machine learning approach depends on manual feature extraction, while the lexicon-based approach relies on sentiment lexicons that are usually unscalable, unreliable, and manually annotated by human experts. Word-embedding techniques are now commonly used in SA classification. Word2Vec and GloVe are currently among the most accurate and usable word-embedding techniques; they transform words into meaningful semantic vectors. However, these techniques ignore the sentiment information of texts and require a huge text corpus for training and generating accurate vectors, which are used as inputs to deep learning models. In this paper, we propose an enhanced ensemble classifier framework based on our previously published lexicon-based method, bag-of-words, and pre-trained word embeddings. First, the sentence is preprocessed by removing stop-words, POS tagging, stemming and lemmatization, and shortening exaggerated words. Second, the processed sentence is passed to three modules, namely our previous lexicon-based method (Sum Votes), a bag-of-words module, and a semantic module (Word2Vec and GloVe), which produce feature vectors. Finally, these feature vectors are fed into 11 different classifiers. The proposed framework is tested and evaluated on four datasets with five different lexicons. The experimental results show that our proposed model outperforms the previous lexicon-based and machine learning methods individually. [ABSTRACT FROM AUTHOR]
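
The feature-extraction flow described in the abstract (preprocess a sentence, then concatenate lexicon, bag-of-words, and embedding features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the tiny stop-word list, lexicon, vocabulary, and two-dimensional vectors are toy stand-ins, the lexicon module uses simple polarity counts rather than the authors' Sum Votes method, and POS tagging, stemming, lemmatization, and the 11 classifiers are omitted.

```python
import re

# Toy resources: hypothetical stand-ins, not the paper's lexicons or embeddings.
STOP_WORDS = {"the", "a", "is", "was", "this", "and"}
LEXICON = {"good": 1, "great": 1, "love": 1, "bad": -1, "awful": -1, "hate": -1}
VOCAB = ["good", "great", "bad", "awful", "movie"]
EMBEDDINGS = {
    "good": [0.9, 0.1],
    "great": [0.8, 0.2],
    "bad": [-0.9, 0.1],
    "awful": [-0.8, 0.3],
    "movie": [0.0, 1.0],
}
EMBED_DIM = 2


def preprocess(sentence):
    """Lowercase, tokenize, shorten exaggerated words, and drop stop-words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    # Collapse runs of three or more identical characters to two ("goooood" -> "good").
    tokens = [re.sub(r"(.)\1{2,}", r"\1\1", t) for t in tokens]
    return [t for t in tokens if t not in STOP_WORDS]


def lexicon_features(tokens):
    """Assumed simplification: positive count, negative count, and their difference."""
    pos = sum(1 for t in tokens if LEXICON.get(t, 0) > 0)
    neg = sum(1 for t in tokens if LEXICON.get(t, 0) < 0)
    return [pos, neg, pos - neg]


def bow_features(tokens):
    """Term counts over a fixed vocabulary."""
    return [tokens.count(w) for w in VOCAB]


def embedding_features(tokens):
    """Average the vectors of in-vocabulary tokens; zeros if none are known."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return [0.0] * EMBED_DIM
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(EMBED_DIM)]


def extract_features(sentence):
    """Concatenate the three feature groups into one vector for a classifier."""
    tokens = preprocess(sentence)
    return lexicon_features(tokens) + bow_features(tokens) + embedding_features(tokens)


if __name__ == "__main__":
    print(extract_features("This movie was goooood"))
```

In a fuller setup, the concatenated vector would be passed to each classifier in the ensemble, and the toy tables would be replaced by real sentiment lexicons and pre-trained Word2Vec or GloVe vectors.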

Details

Language :
English
ISSN :
14690268
Volume :
19
Issue :
4
Database :
Academic Search Index
Journal :
International Journal of Computational Intelligence & Applications
Publication Type :
Academic Journal
Accession number :
147252063
Full Text :
https://doi.org/10.1142/S1469026820500315