Hyperplane Division in Fuzzy C-Means: Clustering Big Data.

Authors :
Shen, Yinghua
Pedrycz, Witold
Chen, Yuan
Wang, Xianmin
Gacek, Adam
Source :
IEEE Transactions on Fuzzy Systems; Nov 2020, Vol. 28 Issue 11, p3032-3046, 15p
Publication Year :
2020

Abstract

Big data with a large number of observations (samples) pose genuine challenges for fuzzy clustering algorithms, and for fuzzy C-means (FCM) in particular. In this article, we propose an original algorithm, referred to as the hyperplane division method, that splits the entire data set into disjoint subsets. By disjoint subsets, we mean that the data subspaces (parts of the entire data space), each of which is supported or spanned by the data points in the corresponding subset, do not overlap. The disjoint subsets turn out to improve the quality of the clusters formed by the clustering algorithms. Moreover, because a clustering task may call for either a large number of clusters (say, thousands) or a small number (say, a few), we propose a corresponding strategy for each case, based on the hyperplane division method, to make the clustering process feasible, efficient, and effective. By validating the proposed strategies on both synthetic and publicly available data, we show that they are visibly superior, in both efficiency and effectiveness, to clustering the entire data set and to some representative big data clustering methods. [ABSTRACT FROM AUTHOR]
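
Illustrative sketch (not the authors' algorithm): the abstract does not say how the dividing hyperplanes are chosen, so the code below assumes one simple rule, namely a hyperplane through the subset's median along its direction of largest variance, applied recursively. It then runs a plain FCM on each disjoint subset and pools the prototypes. The function names `hyperplane_split` and `fcm`, the `depth` parameter, and the demo data are all hypothetical and serve only to show the general idea of dividing the data before clustering.

```python
import numpy as np

def hyperplane_split(X, depth):
    """Recursively split X into at most 2**depth disjoint subsets."""
    if depth == 0 or len(X) < 2:
        return [X]
    centered = X - X.mean(axis=0)
    # Assumption: the hyperplane normal is the direction of largest variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    proj = centered @ vt[0]
    # Assumption: the hyperplane passes through the median projection.
    threshold = np.median(proj)
    left, right = X[proj <= threshold], X[proj > threshold]
    if len(left) == 0 or len(right) == 0:
        return [X]  # degenerate split, keep the subset whole
    return hyperplane_split(left, depth - 1) + hyperplane_split(right, depth - 1)

def fcm(X, c, m=2.0, iters=100, seed=0):
    """Standard fuzzy C-means: returns prototypes V and partition matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)  # memberships of each point sum to 1
    for _ in range(iters):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]  # weighted means
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)  # membership update
    return V, U

# Demo on synthetic data: split into 4 subsets, cluster each one separately.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 2))
subsets = hyperplane_split(X, depth=2)
results = [fcm(S, c=3) for S in subsets]
prototypes = np.vstack([V for V, _ in results])
```

Each point falls on exactly one side of every hyperplane, so the subsets partition the data and each FCM run works on a fraction of the samples. How the per-subset prototypes are combined into a final clustering is left open here.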

Details

Language :
English
ISSN :
1063-6706
Volume :
28
Issue :
11
Database :
Complementary Index
Journal :
IEEE Transactions on Fuzzy Systems
Publication Type :
Academic Journal
Accession number :
146892074
Full Text :
https://doi.org/10.1109/TFUZZ.2019.2947231