Enhancing machine learning-based sentiment analysis through feature extraction techniques.

Authors :
A. Semary, Noura
Ahmed, Wesam
Amin, Khalid
Pławiak, Paweł
Hammad, Mohamed
Source :
PLoS ONE. 2/14/2024, Vol. 19 Issue 2, p1-19. 19p.
Publication Year :
2024

Abstract

Feature extraction is a crucial part of sentiment classification because it extracts valuable information from text data, which affects the model's performance. The goal of this paper is to help in selecting a suitable feature extraction method to enhance the performance of sentiment analysis tasks. To provide directions for future machine learning and feature extraction research, it is important to analyze and summarize feature extraction techniques methodically from a machine learning standpoint. Several methods are considered, including Bag-of-Words (BOW), Word2Vec, N-gram, Term Frequency-Inverse Document Frequency (TF-IDF), Hashing Vectorizer (HV), and Global Vectors for Word Representation (GloVe). To test the ability of each feature extractor, we applied it to the Twitter US airlines and Amazon musical instrument reviews datasets. Finally, we trained a random forest classifier on 70% of each dataset and tested it on the remaining 30%, enabling us to evaluate and compare performance using different metrics. Based on our results, we find that the TF-IDF technique demonstrates superior performance, with an accuracy of 99% on the Amazon reviews dataset and 96% on the Twitter US airlines dataset. This study underscores the importance of feature extraction in sentiment analysis, offering practical insights for improving model performance and guiding future research. [ABSTRACT FROM AUTHOR]
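
The TF-IDF weighting and the 70/30 split mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the tokenizer, the smoothed IDF formula, the L2 normalisation, the toy reviews, and all function names below are assumptions chosen for illustration, and the random forest classifier step is left out.

```python
import math
import random
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a real pipeline would clean more)."""
    return text.lower().split()


def fit_idf(docs):
    """Compute a smoothed inverse document frequency for each training term."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))  # count each term once per document
    # Assumed smoothing: idf = ln((1 + n) / (1 + df)) + 1
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(doc, idf):
    """Map a document to a sparse {term: weight} dict with unit L2 norm."""
    counts = Counter(t for t in tokenize(doc) if t in idf)  # drop unseen terms
    weights = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def train_test_split(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of the items and cut it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy data: (text, label) pairs, not taken from either dataset.
reviews = [
    ("great flight and friendly crew", "positive"),
    ("lost my luggage again", "negative"),
    ("the guitar strings sound great", "positive"),
    ("delayed for three hours", "negative"),
    ("excellent tuner for the price", "positive"),
    ("terrible customer service", "negative"),
    ("smooth boarding and on time", "positive"),
    ("cable broke after one week", "negative"),
    ("love this microphone stand", "positive"),
    ("rude staff and dirty cabin", "negative"),
]

train, test = train_test_split(reviews)
idf = fit_idf([text for text, _ in train])  # IDF is fitted on the training part only
train_vectors = [tfidf_vector(text, idf) for text, _ in train]
test_vectors = [tfidf_vector(text, idf) for text, _ in test]
```

Fitting the IDF table on the training portion only keeps information from the test reviews out of the features, and the resulting sparse vectors could then be passed to any classifier.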

Details

Language :
English
ISSN :
19326203
Volume :
19
Issue :
2
Database :
Academic Search Index
Journal :
PLoS ONE
Publication Type :
Academic Journal
Accession number :
175442147
Full Text :
https://doi.org/10.1371/journal.pone.0294968