1. A Chinese text classification model based on vector space and semantic meaning
- Author
- Shao-Min Zhang and Bao-Yi Wang
- Subjects
Noisy text analytics, Text simplification, Computer science, Text segmentation, Text graph, Semantic network, Text mining, Explicit semantic analysis, Artificial intelligence, Cluster analysis, Semantic compression, Natural language, Natural language processing, Latent semantic indexing
- Abstract
With electronic text materials increasing rapidly, this work proposes a model for automatically classifying electronic text so that it can be managed and used effectively. The model covers word segmentation based on a word dictionary and statistics, text preprocessing, the design of a weight function for feature words and their collection, vector space representation of text, latent semantic indexing, and a text clustering algorithm, among other components. Experiments show that the model achieves satisfactory classification results as well as high computational and storage efficiency.
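The first stages named in the abstract (dictionary-based segmentation, feature weighting, and vector space representation) can be illustrated with a minimal sketch. The abstract does not specify the authors' algorithms, so everything here is an assumption: forward maximum matching stands in for the dictionary-based segmenter, TF-IDF stands in for the weight function, and the names `fmm_segment`, `tfidf_vectors`, and `cosine`, the toy dictionary, and the sample documents are all hypothetical. The statistical part of segmentation, latent semantic indexing, and clustering are omitted.

```python
import math
from collections import Counter


def fmm_segment(text, dictionary, max_len=4):
    """Forward maximum matching: greedily take the longest dictionary word."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when nothing longer matches.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def tfidf_vectors(docs_tokens):
    """Return one sparse {term: weight} dict per document (tf * log(N / df))."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    vectors = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zero."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy dictionary and documents, for illustration only.
dictionary = {"文本", "分类", "模型", "聚类", "算法", "天气", "预报"}
docs = ["文本分类模型", "文本聚类算法", "天气预报"]

tokens = [fmm_segment(d, dictionary) for d in docs]
vectors = tfidf_vectors(tokens)

print(tokens[0])
print(cosine(vectors[0], vectors[1]) > cosine(vectors[0], vectors[2]))
```

In this toy setup the first two documents share a term and so have a nonzero cosine similarity, while the third shares none with the first. A fuller pipeline would reduce these vectors with latent semantic indexing before clustering, as the abstract describes.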
- Published
- 2005