
Text Categorization in Non-linear Semantic Space.

Authors :
Carbonell, Jaime G.
Siekmann, Jörg
Basili, Roberto
Pazienza, Maria Teresa
Biancalana, Claudio
Micarelli, Alessandro
Source :
AI*IA 2007: Artificial Intelligence & Human-Oriented Computing; 2007, p749-756, 8p
Publication Year :
2007

Abstract

Automatic Text Categorization (TC) is a complex and useful task for many natural language applications, and is usually performed using a set of manually classified documents, i.e., a training collection. Term-based representation of documents has found widespread use in TC. However, one of the main shortcomings of such methods is that they largely disregard lexical semantics and, as a consequence, are not sufficiently robust to variations in word usage. In this paper we design, implement, and evaluate a new text classification technique. The main idea is to find a series of projections of the training data using a new, modified Latent Semantic Indexing (LSI) algorithm, project all training instances onto the low-dimensional subspace found in the previous step, and finally induce a binary search on the projected low-dimensional data. We conclude that our approach, for all its simplicity and efficiency, achieves classification accuracy comparable to that of Support Vector Machines (SVM). [ABSTRACT FROM AUTHOR]
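
The pipeline outlined in the abstract (find a low-dimensional projection from the training data, project the training instances, then classify in the projected space) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the modified LSI algorithm or the final search step, so the sketch assumes plain LSI (top singular directions of the term-document matrix, approximated by power iteration) and substitutes a nearest-centroid classifier. The vocabulary, the toy documents, and all function names are hypothetical.

```python
import math

# Hypothetical toy vocabulary and training collection (not from the paper).
VOCAB = ["car", "engine", "wheel", "stock", "market", "trade"]
TRAIN = [
    ("car engine wheel", "auto"),
    ("car engine engine", "auto"),
    ("wheel car car", "auto"),
    ("stock market trade", "finance"),
    ("market trade trade", "finance"),
    ("stock stock market", "finance"),
]

def vectorize(text):
    # Term-based representation: raw counts over the fixed vocabulary.
    tokens = text.split()
    return [float(tokens.count(term)) for term in VOCAB]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def top_directions(rows, k, iters=300):
    # Plain LSI stand-in: approximate the top-k right singular vectors of the
    # document-term matrix by power iteration on its Gram matrix, removing
    # the components along directions that were already found.
    n = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    directions = []
    for _ in range(k):
        v = [1.0] * n
        for _step in range(iters):
            w = [dot(gram_row, v) for gram_row in gram]
            for u in directions:
                c = dot(w, u)
                w = [a - c * b for a, b in zip(w, u)]
            norm = math.sqrt(dot(w, w))
            v = [a / norm for a in w]
        directions.append(v)
    return directions

def project(vec, directions):
    # Coordinates of a document in the low-dimensional subspace.
    return [dot(vec, u) for u in directions]

def fit(train, k):
    rows = [vectorize(text) for text, _ in train]
    directions = top_directions(rows, k)
    grouped = {}
    for row, (_, label) in zip(rows, train):
        grouped.setdefault(label, []).append(project(row, directions))
    # One centroid per class in the projected space.
    centroids = {
        label: [sum(col) / len(points) for col in zip(*points)]
        for label, points in grouped.items()
    }
    return directions, centroids

def classify(text, directions, centroids):
    # Nearest-centroid rule: an assumed substitute for the paper's final step.
    point = project(vectorize(text), directions)
    return min(centroids, key=lambda label: math.dist(point, centroids[label]))

directions, centroids = fit(TRAIN, k=2)
print(classify("engine wheel wheel", directions, centroids))
print(classify("trade stock", directions, centroids))
```

Both query documents differ from every training document, yet each projects close to the centroid of its own class, because the two retained directions summarize how terms co-occur within each topic. A real system would use a sparse SVD routine and a weighting scheme such as TF-IDF rather than raw counts and power iteration.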

Details

Language :
English
ISBNs :
9783540747819
Database :
Complementary Index
Journal :
AI*IA 2007: Artificial Intelligence & Human-Oriented Computing
Publication Type :
Book
Accession number :
33106281
Full Text :
https://doi.org/10.1007/978-3-540-74782-6_65