Text Categorization in Non-linear Semantic Space.
- Source :
- AI*IA 2007: Artificial Intelligence & Human-Oriented Computing; 2007, p749-756, 8p
- Publication Year :
- 2007
Abstract
- Automatic Text Categorization (TC) is a complex and useful task for many natural language applications, and is usually performed by using a set of manually classified documents, i.e. a training collection. Term-based representation of documents has found widespread use in TC. However, one of the main shortcomings of such methods is that they largely disregard lexical semantics and, as a consequence, are not sufficiently robust with respect to variations in word usage. In this paper we design, implement, and evaluate a new text classification technique. Our main idea consists in finding a series of projections of the training data by using a new, modified LSI algorithm, projecting all training instances to the low-dimensional subspace found in the previous step, and finally inducing a binary search on the projected low-dimensional data. Our conclusion is that, with all its simplicity and efficiency, our approach is comparable to SVM accuracy on classification. [ABSTRACT FROM AUTHOR]
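The pipeline outlined in the abstract (find a low-dimensional projection from the training data, project the training instances, then classify in the projected space) can be illustrated with a minimal sketch. This is not the authors' method: it assumes plain LSI (top singular directions of the term-count matrix, approximated here by power iteration) in place of their modified LSI algorithm, and it swaps in a nearest-centroid rule for the final step, which the abstract names but does not specify. The toy documents, labels, and all function names are hypothetical.

```python
import math
from collections import Counter

# Toy training collection (hypothetical data, not taken from the paper).
TRAIN = [
    ("striker scored goal", "sport"),
    ("team won match", "sport"),
    ("goal decided match", "sport"),
    ("team scored goal", "sport"),
    ("bank raised interest rates", "finance"),
    ("market fell rates rose", "finance"),
    ("interest market grew", "finance"),
    ("bank market rates", "finance"),
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def vectorize(text, vocab):
    """Term-count vector over vocab; out-of-vocabulary words are ignored."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def top_k_directions(X, k, iters=200):
    """Approximate the top-k right singular vectors of X (rows = documents)
    by power iteration with deflation on the Gram matrix X^T X."""
    n = len(X[0])
    C = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    dirs = []
    for _ in range(k):
        v = [1.0 + 0.01 * i for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = [dot(C[i], v) for i in range(n)]
            for d in dirs:  # remove components along directions already found
                c = dot(w, d)
                w = [a - c * b for a, b in zip(w, d)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:  # rank exhausted: fewer than k directions exist
                return dirs
            v = [a / norm for a in w]
        dirs.append(v)
    return dirs

def project(x, dirs):
    """Coordinates of a term vector in the low-dimensional subspace."""
    return [dot(x, d) for d in dirs]

def fit(train, k=2):
    vocab = sorted({w for text, _ in train for w in text.lower().split()})
    X = [vectorize(text, vocab) for text, _ in train]
    dirs = top_k_directions(X, k)
    by_label = {}
    for x, (_, label) in zip(X, train):
        by_label.setdefault(label, []).append(project(x, dirs))
    # Assumed classifier: one centroid per class in the projected space.
    centroids = {
        label: [sum(col) / len(points) for col in zip(*points)]
        for label, points in by_label.items()
    }
    return vocab, dirs, centroids

def classify(text, model):
    vocab, dirs, centroids = model
    z = project(vectorize(text, vocab), dirs)
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(z, centroids[label]))
    return min(centroids, key=sq_dist)

MODEL = fit(TRAIN)
```

Calling `classify("team scored match", MODEL)` assigns a label to an unseen document by projecting it with the directions learned from the training collection only. A real system would use a sparse SVD routine and weighted term counts; the pure-Python power iteration is here only to keep the sketch dependency-free.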
Details
- Language :
- English
- ISBNs :
- 9783540747819
- Database :
- Complementary Index
- Journal :
- AI*IA 2007: Artificial Intelligence & Human-Oriented Computing
- Publication Type :
- Book
- Accession number :
- 33106281
- Full Text :
- https://doi.org/10.1007/978-3-540-74782-6_65