Different Word Representation for Text Classification: A Comparative Study
- Source :
- AICCSA
- Publication Year :
- 2019
- Publisher :
- IEEE, 2019.
Abstract
- Documents usually contain a large number of words, and some of their occurrences can complicate the classification process and make it less accurate. Word representation methods have been employed to handle this issue. In this comparative study, we assess the effectiveness of word embedding and the TF-IDF weighting scheme by applying four classifiers and measuring classification accuracy. The evaluation was carried out on the popular 20Newsgroup text document dataset. In our experiments, using the TF-IDF method with ANN classifiers on the 20Newsgroup dataset greatly improved the classification of text documents compared with word embedding and the other classifiers.
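As a rough illustration of the TF-IDF weighting mentioned in the abstract, the sketch below computes the textbook form of the weight (term frequency times the log of inverse document frequency) in plain Python. The record does not state which TF-IDF variant, tokenizer, or smoothing the authors used, so the formula, the function name `tf_idf`, and the three-sentence toy corpus are all assumptions made for illustration. It is not the paper's pipeline and does not use the 20Newsgroup data.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed textbook form: (count / doc length) * log(N / document frequency).
    The paper may use a different variant.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, tokenized by whitespace.
corpus = [
    "the team won the hockey game".split(),
    "the graphics card renders the game".split(),
    "the senate passed the bill".split(),
]
weights = tf_idf(corpus)
```

With this formulation, a word that occurs in every document (here "the") receives a weight of zero, while a word confined to a single document (here "hockey") is weighted more heavily than one shared by several documents (here "game"). That down-weighting of uninformative, frequent words is the property the abstract relies on.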
- Subjects :
- Word embedding
Artificial neural network
Computer science
business.industry
Text document
02 engineering and technology
computer.software_genre
Weighting
03 medical and health sciences
ComputingMethodologies_PATTERNRECOGNITION
0302 clinical medicine
Schema (psychology)
ComputingMethodologies_DOCUMENTANDTEXTPROCESSING
030221 ophthalmology & optometry
0202 electrical engineering, electronic engineering, information engineering
Word representation
020201 artificial intelligence & image processing
Artificial intelligence
tf–idf
business
computer
Natural language processing
Details
- Database :
- OpenAIRE
- Journal :
- 2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA)
- Accession number :
- edsair.doi...........c78b3d3f27177aee314c0e342e737bb9