Empirical evaluation of feature projection algorithms for multi-view text classification.

Authors :
Mirończuk, Marcin Michał
Protasiewicz, Jarosław
Pedrycz, Witold
Source :
Expert Systems with Applications. Sep2019, Vol. 130, p97-112. 16p.
Publication Year :
2019

Abstract

• Multi-view text classification may be better than simple text classification.
• Feature projection methods may outperform a multi-view text classification approach.
• The proposed ranking method allows for selecting the best classification model.

This study aims to propose (i) a multi-view text classification method and (ii) a ranking method that allows for selecting the best information fusion layer among many variations. Multi-view document classification is worth a detailed study as it makes it possible to combine different feature sets into yet another view that further improves text classification. For this purpose, we propose a multi-view framework for text classification that is composed of two levels of information fusion. At the first level, classifiers are constructed by various machine learning algorithms using different data views, i.e. different vector space models. At the second level, the information fusion layer processes its input using a feature projection method and a meta-classifier modelled by a selected machine learning algorithm, and reaches a final decision based on the classification results produced by the first-level models. Moreover, we propose a ranking method to assess various configurations of the fusion layer. We use heuristics that utilise statistical properties of F-score values calculated for classification results produced at the fusion layer. The information fusion layer of the classification framework and the ranking method have been empirically evaluated. For this purpose, we introduce a use case checking whether companies' domains identify their innovativeness. The results empirically demonstrate that the information fusion layer enhances classification quality. Friedman's aligned rank and Wilcoxon signed-rank statistical tests and the effect size support this hypothesis. In addition, the Spearman statistical test carried out for the obtained results demonstrated that the assessment made by the proposed ranking method converges to a well-established method named Hellinger - The Technique for Order Preference by Similarity to Ideal Solution (H-TOPSIS). Thus, the proposed approach may be used for the assessment of classifier performance. [ABSTRACT FROM AUTHOR]
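
The abstract compares the proposed ranking with H-TOPSIS through a Spearman test. As a rough illustration of what such a comparison measures, the sketch below computes Spearman's rank correlation coefficient between two tie-free rankings of the same fusion-layer configurations. The function name `spearman_rho` and both rank lists are hypothetical and are not taken from the paper. The sketch computes only the coefficient, not the significance test the authors report.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rankings of the same items.

    Uses the closed form 1 - 6 * sum(d^2) / (n * (n^2 - 1)), which is valid
    only when neither ranking contains ties.
    """
    if len(rank_a) != len(rank_b):
        raise ValueError("both rankings must cover the same number of items")
    n = len(rank_a)
    if n < 2:
        raise ValueError("at least two items are needed")
    # Sum of squared differences between the two ranks of each item
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical ranks given to five fusion-layer configurations
# by two assessment methods (illustrative values only).
proposed_ranks = [1, 2, 3, 4, 5]
reference_ranks = [2, 1, 3, 4, 5]

rho = spearman_rho(proposed_ranks, reference_ranks)
print(round(rho, 3))
```

A coefficient close to 1 means the two methods order the configurations almost identically, while a coefficient close to -1 means the orderings are reversed.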

Details

Language :
English
ISSN :
09574174
Volume :
130
Database :
Academic Search Index
Journal :
Expert Systems with Applications
Publication Type :
Academic Journal
Accession number :
136272265
Full Text :
https://doi.org/10.1016/j.eswa.2019.04.020