Hate Speech Detection on Twitter in Indonesia with Feature Expansion Using GloVe
- Source :
- Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi), Vol 5, Iss 6, Pp 1044-1051 (2021)
- Publication Year :
- 2021
- Publisher :
- Ikatan Ahli Informatika Indonesia, 2021.
Abstract
- Twitter is a popular social media platform for voicing opinions in the form of criticism and suggestions. Criticism can become hate speech when it attacks a target (an individual, a race, or a group). Because a tweet is limited to 280 characters, users often abbreviate words, which causes vocabulary mismatch; word embeddings can mitigate this. This study applies feature expansion using Global Vectors (GloVe) to reduce vocabulary mismatch in Indonesian-language hate speech on Twitter. The best-performing model and feature configuration are identified using the Logistic Regression (LR), Random Forest (RF), and Artificial Neural Network (ANN) algorithms. The results show that the Random Forest model with 5,000 features, using TF-IDF combined with a Tweet corpus built with GloVe, achieves the best accuracy among the models, with an average accuracy of 88.59%, which is 1.25% higher than the predetermined baseline. The number of features used is shown to improve system performance.
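The abstract does not spell out how feature expansion works, so the following is only a minimal sketch of one common reading of the idea: when a vocabulary term is absent from a tweet but one of its nearest neighbours in embedding space is present, the term's feature is switched on. Everything here is an assumption made for illustration: the toy three-dimensional vectors stand in for trained GloVe embeddings, the feature values are binary rather than TF-IDF weights, and the names `TOY_VECTORS`, `top_similar`, and `expand_features` are hypothetical.

```python
import math

# Toy vectors standing in for trained GloVe embeddings (hypothetical values).
TOY_VECTORS = {
    "benci": [0.90, 0.10, 0.00],
    "bnci": [0.88, 0.12, 0.01],  # abbreviated spelling of "benci"
    "suka": [0.00, 0.20, 0.90],
    "makan": [0.10, 0.90, 0.10],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_similar(word, vectors, k=1):
    """Return the k words closest to `word` by cosine similarity."""
    if word not in vectors:
        return []
    scored = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def expand_features(tokens, vocabulary, vectors, k=1):
    """Binary bag-of-words vector with embedding-based feature expansion.

    A vocabulary term counts as present if the tweet contains the term
    itself or any of its k most similar words.
    """
    present = set(tokens)
    features = []
    for term in vocabulary:
        if term in present:
            features.append(1)
        elif any(s in present for s in top_similar(term, vectors, k)):
            features.append(1)  # expanded: a similar word appears instead
        else:
            features.append(0)
    return features


vocabulary = ["benci", "suka", "makan"]
tweet = ["aku", "bnci", "dia"]  # contains the abbreviation only
expanded = expand_features(tweet, vocabulary, TOY_VECTORS)
print(expanded)  # [1, 0, 0]
```

Without expansion the tweet would map to an all-zero vector, because "bnci" is not in the vocabulary. In a fuller pipeline the expanded vectors would then be passed to a classifier such as Random Forest.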
Details
- Language :
- English
- ISSN :
- 2580-0760
- Volume :
- 5
- Issue :
- 6
- Database :
- Directory of Open Access Journals
- Journal :
- Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi)
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.93814c04f84a9ba6742b5aa07681ec
- Document Type :
- article
- Full Text :
- https://doi.org/10.29207/resti.v5i6.3521