A New Fuzzy Co-clustering Algorithm for Categorization of Datasets with Overlapping Clusters.
- Source :
- Advanced Data Mining & Applications (9783540370253); 2006, p328-339, 12p
- Publication Year :
- 2006
Abstract
- Fuzzy co-clustering is a method that performs simultaneous fuzzy clustering of objects and features. In this paper, we introduce a new fuzzy co-clustering algorithm for high-dimensional datasets called Cosine-Distance-based & Dual-partitioning Fuzzy Co-clustering (CODIALING FCC). Unlike many existing fuzzy co-clustering algorithms, CODIALING FCC is a dual-partitioning algorithm: it clusters the features in the same manner as it clusters the objects, that is, by partitioning them according to their natural groupings. It is also a cosine-distance-based algorithm because it uses the cosine distance to capture the belongingness of objects and features in the co-clusters. Our main purpose in introducing this new algorithm is to improve on the performance of some prominent existing fuzzy co-clustering algorithms in dealing with datasets with high overlaps. In our opinion, this is crucial, since most real-world datasets involve a significant amount of overlap in their inherent clustering structures. We discuss how this improvement can be made through the dual-partitioning formulation adopted. Experimental results on a toy problem and five large benchmark document datasets demonstrate the effectiveness of CODIALING FCC in handling overlaps better. [ABSTRACT FROM AUTHOR]
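The two ideas named in the abstract, cosine distance as the measure of belongingness and partitioning features the same way as objects, can be illustrated with a minimal sketch. This is not the CODIALING FCC update rule, which the abstract does not give. The softmax-style membership function, the `temperature` parameter, the toy matrix, and the fixed prototypes are all assumptions made for illustration only.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(a, b); defined here as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def partition_memberships(distances, temperature=0.1):
    """Map distances to C clusters onto memberships that sum to 1.

    Hypothetical softmax rule, used only to show the partitioning
    constraint; it is not the membership update from the paper.
    """
    weights = [math.exp(-d / temperature) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


# Toy document-term matrix: 4 documents (objects) x 3 terms (features).
data = [
    [2.0, 1.0, 0.0],
    [3.0, 1.0, 0.0],
    [0.0, 1.0, 2.0],
    [0.0, 1.0, 3.0],
]

# Hypothetical fixed prototypes for 2 co-clusters (assumed, not learned).
object_prototypes = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]            # term space
feature_prototypes = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]  # document space

# Object partitioning: each document's memberships sum to 1 over clusters.
object_memberships = [
    partition_memberships([cosine_distance(row, p) for p in object_prototypes])
    for row in data
]

# Dual partitioning: features (columns) are handled the same way, so each
# term's memberships also sum to 1 over clusters.
columns = [list(col) for col in zip(*data)]
feature_memberships = [
    partition_memberships([cosine_distance(col, p) for p in feature_prototypes])
    for col in columns
]

for j, u in enumerate(feature_memberships):
    print(f"term {j}: {[round(v, 3) for v in u]}")
```

In this toy setup the middle term occurs equally in every document, so it is equidistant from both prototypes and receives equal membership in both co-clusters, which is the kind of overlap the abstract refers to.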
Details
- Language :
- English
- ISBNs :
- 9783540370253
- Database :
- Complementary Index
- Journal :
- Advanced Data Mining & Applications (9783540370253)
- Publication Type :
- Book
- Accession number :
- 32864284
- Full Text :
- https://doi.org/10.1007/11811305_36