Performance of using LDA for Chinese news text classification
- Source :
- CCECE
- Publication Year :
- 2015
- Publisher :
- IEEE, 2015.
- Abstract :
- Chinese text classification is challenging, especially when the data are high-dimensional and sparse. This paper examines text representation and dimension reduction for Chinese text classification. First, we introduce a topic model, Latent Dirichlet Allocation (LDA), and use it as a dimension reduction method. Second, we choose the Support Vector Machine (SVM) as the classification algorithm. Next, we describe a text classification method based on LDA and SVM. Finally, we run experiments on a large collection of Chinese text documents. Comparing the LDA method with the traditional TF∗IDF method, the experimental results show that the LDA method achieves better results in both classification accuracy and running time.
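The pipeline the abstract outlines (LDA as a dimension reduction step feeding an SVM, compared against TF∗IDF features) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the toy corpus, the labels, the number of topics, the choice of a linear SVM, and the helper names `build_lda_svm` and `build_tfidf_svm` are all assumptions, and the text is assumed to be word-segmented already, with tokens separated by spaces.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy corpus: Chinese news-like snippets, already segmented
# into space-separated words (the segmentation step itself is not shown).
docs = [
    "股票 市场 上涨 投资",
    "银行 利率 投资 股票",
    "市场 银行 经济 增长",
    "球队 比赛 进球 冠军",
    "球员 比赛 教练 球队",
    "冠军 联赛 进球 球员",
]
labels = ["finance", "finance", "finance", "sports", "sports", "sports"]


def build_lda_svm(n_topics=2, seed=0):
    """Word counts -> LDA topic proportions -> linear SVM (assumed kernel)."""
    return make_pipeline(
        CountVectorizer(token_pattern=r"\S+"),  # split on whitespace only
        LatentDirichletAllocation(n_components=n_topics, random_state=seed),
        LinearSVC(),
    )


def build_tfidf_svm():
    """Baseline: TF-IDF weights over the full vocabulary -> linear SVM."""
    return make_pipeline(TfidfVectorizer(token_pattern=r"\S+"), LinearSVC())


lda_svm = build_lda_svm().fit(docs, labels)
tfidf_svm = build_tfidf_svm().fit(docs, labels)

# Features seen by each classifier: one column per topic for LDA,
# one column per vocabulary term for TF-IDF.
lda_features = lda_svm[:-1].transform(docs)
tfidf_features = tfidf_svm[:-1].transform(docs)
print("LDA feature shape:", lda_features.shape)
print("TF-IDF feature shape:", tfidf_features.shape)
```

The point of the sketch is the difference in feature dimensionality: the SVM behind LDA sees one value per topic, while the TF∗IDF baseline sees one value per vocabulary term. On a corpus this small, accuracy and timing figures would say nothing about the results reported in the paper.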
- Subjects :
- Topic model
Computer science
business.industry
Dimensionality reduction
Pattern recognition
Machine learning
computer.software_genre
Latent Dirichlet allocation
Support vector machine
symbols.namesake
Statistical classification
ComputingMethodologies_PATTERNRECOGNITION
symbols
Artificial intelligence
business
Representation (mathematics)
computer
Details
- Database :
- OpenAIRE
- Journal :
- 2015 IEEE 28th Canadian Conference on Electrical and Computer Engineering (CCECE)
- Accession number :
- edsair.doi...........25741880b44d45d74ed1a1f279856a0f