Different Word Representation for Text Classification: A Comparative Study

Authors :
Lubna Alhenki
Mohammed Al-Dhelaan
Eman Alsagour
Source :
AICCSA
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

Documents usually contain a large number of words, and some of these words can complicate the classification process and make it less accurate. Word representation methods are employed to handle this issue. In this comparative study, we assess the effectiveness of word embedding and the TF-IDF weighting scheme by applying four classifiers and measuring classification accuracy. The experiments were run on the popular 20Newsgroup text document dataset. We found that, on the 20Newsgroup dataset, using the TF-IDF method with an ANN classifier greatly improved text document classification compared with word embedding and the other classifiers.
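
To illustrate the TF-IDF weighting mentioned in the abstract, here is a minimal sketch in Python. It assumes the textbook formulation (term frequency normalized by document length, multiplied by log(N / document frequency)); the abstract does not state which TF-IDF variant the authors used. The function name `tfidf` and the toy corpus are hypothetical and are not taken from the paper or the 20Newsgroup dataset.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed formulation: tf = count / document length,
    idf = log(number of documents / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Toy corpus (hypothetical, for illustration only).
corpus = [
    "the rocket reached orbit".split(),
    "the team won the game".split(),
    "the orbit of the moon".split(),
]
weights = tfidf(corpus)
```

Under this formulation, a word that occurs in every document (here "the") receives a weight of zero, while a word confined to one document (here "rocket") receives the highest weight in that document. This is the sense in which a weighting scheme can downplay words that would otherwise complicate classification.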

Details

Database :
OpenAIRE
Journal :
2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA)
Accession number :
edsair.doi...........c78b3d3f27177aee314c0e342e737bb9