
Possibilistic fuzzy co-clustering of large document collections

Authors :
Tjhi, William-Chandra
Chen, Lihui
Source :
Pattern Recognition. Dec 2007, Vol. 40, Issue 12, pp. 3452-3466. 15 pp.
Publication Year :
2007

Abstract

In this paper we propose a new co-clustering algorithm called possibilistic fuzzy co-clustering (PFCC) for automatic categorization of large document collections. PFCC integrates a possibilistic document clustering technique and a combined formulation of fuzzy word ranking and partitioning into a fast iterative co-clustering procedure. This novel framework simultaneously brings several benefits, including robustness in the presence of document and word outliers, rich representations of co-clusters, highly descriptive document clusters, good performance in a high-dimensional space, and reduced sensitivity to initialization in the possibilistic clustering. We present the detailed formulation of PFCC together with the motivations behind it. The advantages over existing works and a proof of the algorithm's convergence are provided. Experiments on several large document data sets demonstrate the effectiveness of PFCC. [Copyright Elsevier]

Details

Language :
English
ISSN :
0031-3203
Volume :
40
Issue :
12
Database :
Academic Search Index
Journal :
Pattern Recognition
Publication Type :
Academic Journal
Accession number :
26148586
Full Text :
https://doi.org/10.1016/j.patcog.2007.04.017