Topic Clustering and Classification on Final Project Reports: a Comparison of Traditional and Modern Approaches.

Authors :
Bunyamin, Hendra
Heriyanto
Novianti, Stevani
Sulistiani, Lisan
Source :
IAENG International Journal of Computer Science; Sep2019, Vol. 46 Issue 3, p506-511, 6p
Publication Year :
2019

Abstract

Text clustering and classification have been studied extensively in the machine learning literature. For clustering text, topic modeling algorithms are statistical methods that discover unseen structures in archives of documents. Equally important, Convolutional Neural Networks (ConvNets) have been successfully applied to classifying text without requiring information about the syntactic and semantic aspects of a language. In this paper, we use both clustering and classification algorithms to organize and classify topics from final project reports. For the clustering task, we examine two techniques: Latent Dirichlet Allocation (LDA) functioning as a unigram model, and LDA supported by a Skip-gram model. Our results show that the topical word distributions found by both techniques represent the keywords of each topic; in particular, a Skip-gram model working hand in hand with LDA is suitable for acquiring topical words from the final report topics. For the classification task, we analyze the application of ConvNets, artificial neural nets with ReLU activation functions, and traditional algorithms. Concretely, our findings suggest that selecting the parts of a report that contain essential information is very important for ConvNets to learn. Additionally, traditional algorithms are preferable to neural net-based algorithms if the size of the dataset is less than 20,000; as a result, our traditional algorithms, specifically Ridge classifier, Passive-Aggressive, and Support Vector Machines, significantly outperform the neural net-based algorithms. [ABSTRACT FROM AUTHOR]
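
As a rough illustration of one of the traditional algorithms named in the abstract, the sketch below implements a binary Passive-Aggressive classifier (the PA-I update rule) over bag-of-words counts, using only the Python standard library. The abstract does not describe the authors' features, hyperparameters, or implementation, so everything here is an assumption made for illustration: the function names (`featurize`, `train_pa`, `predict`), the values of `epochs` and `C`, the two-class setup, and the toy training sentences are all hypothetical.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words counts as a sparse dict (lowercased whitespace tokens)."""
    return Counter(text.lower().split())


def dot(w, x):
    """Sparse dot product between a weight dict and a feature dict."""
    return sum(w.get(tok, 0.0) * val for tok, val in x.items())


def train_pa(samples, epochs=5, C=1.0):
    """Train a binary Passive-Aggressive (PA-I) model; labels are +1 or -1."""
    w = {}
    for _ in range(epochs):
        for text, y in samples:
            x = featurize(text)
            loss = max(0.0, 1.0 - y * dot(w, x))  # hinge loss on this sample
            if loss > 0.0:
                norm_sq = sum(v * v for v in x.values())
                # Step size: just enough to fix the margin, capped at C.
                tau = min(C, loss / norm_sq)
                for tok, val in x.items():
                    w[tok] = w.get(tok, 0.0) + tau * y * val
    return w


def predict(w, text):
    """Return +1 or -1 according to the sign of the linear score."""
    return 1 if dot(w, featurize(text)) >= 0 else -1


# Hypothetical toy data (not from the paper):
# +1 = "machine learning" topic, -1 = "networking" topic.
TRAIN = [
    ("neural network image classification", 1),
    ("topic model text clustering", 1),
    ("router packet switching protocol", -1),
    ("wireless network bandwidth protocol", -1),
]

weights = train_pa(TRAIN)
```

Calling `predict(weights, "text classification model")` then scores an unseen title against the learned word weights. The model stays "passive" when a sample is already classified with enough margin and updates "aggressively" otherwise, which is one reason linear methods of this kind can be trained on small datasets.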

Details

Language :
English
ISSN :
1819656X
Volume :
46
Issue :
3
Database :
Supplemental Index
Journal :
IAENG International Journal of Computer Science
Publication Type :
Academic Journal
Accession number :
138531442