Probability model selection and parameter evolutionary estimation for clustering imbalanced data without sampling.

Authors :
Fan, Jiancong
Niu, Zhonghan
Liang, Yongquan
Zhao, Zhongying
Source :
Neurocomputing. Oct2016, Vol. 211, p172-181. 10p.
Publication Year :
2016

Abstract

Data imbalance problems arising from the accumulation of data, especially big data, have become a challenging issue in recent years. In imbalanced data, the minority subsets often carry important patterns. Although there are approaches for discovering class patterns, few of them have been applied to clustering minority patterns. Typically, minority samples are submerged in big data; without the supervision of a training set, they are often ignored or misassigned to majority patterns. Because clustering minorities is an uncertain process, in this paper we employ model selection and evolutionary computation to address the uncertainty and concealment of minority data in imbalanced data clustering. Given a data set, model selection chooses a model from a set of candidate models. We select probability models as candidates because they handle uncertainty effectively and are therefore well suited to data imbalance. Considering the difficulty of estimating the models' parameters, we employ an evolutionary process to adjust and estimate the optimal parameters. Experimental results show that our proposed approach for clustering imbalanced data is able to search for and discover minority patterns, and also outperforms many other relevant clustering algorithms on several performance indices. [ABSTRACT FROM AUTHOR]
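
The general idea in the abstract (choose among candidate probability models, and estimate each model's parameters with an evolutionary search) can be illustrated with a toy sketch. This is not the authors' algorithm: the two candidate models (a single Gaussian and a two-component Gaussian mixture), the BIC selection criterion, the simple elitist mutation scheme, and every name below (`evolve`, `select_model`, the parameter bounds) are assumptions made for illustration only.

```python
import math
import random

LOG_SQRT_2PI = 0.5 * math.log(2 * math.pi)

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z - LOG_SQRT_2PI) / sigma

def loglik_single(theta, data):
    # theta = [mu, log_sigma]
    mu, log_sigma = theta
    sigma = math.exp(log_sigma)
    return sum(math.log(normal_pdf(x, mu, sigma) + 1e-300) for x in data)

def loglik_mixture(theta, data):
    # theta = [logit of weight, mu1, log_sigma1, mu2, log_sigma2]
    logit_w, mu1, ls1, mu2, ls2 = theta
    w = 1.0 / (1.0 + math.exp(-logit_w))
    s1, s2 = math.exp(ls1), math.exp(ls2)
    return sum(
        math.log(w * normal_pdf(x, mu1, s1) + (1 - w) * normal_pdf(x, mu2, s2) + 1e-300)
        for x in data
    )

def evolve(loglik, bounds, data, rng, pop_size=30, generations=150):
    """Elitist evolutionary search: keep the fittest third, mutate to refill."""
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for g in range(generations):
        ranked = sorted(pop, key=lambda t: loglik(t, data), reverse=True)
        elite = ranked[: pop_size // 3]
        # Mutation scale shrinks over time, relative to each parameter's range.
        step = 0.1 * (1 - g / generations) + 0.005
        children = []
        while len(elite) + len(children) < pop_size:
            parent = rng.choice(elite)
            children.append([
                min(hi, max(lo, v + rng.gauss(0, step * (hi - lo))))
                for v, (lo, hi) in zip(parent, bounds)
            ])
        pop = elite + children
    best = max(pop, key=lambda t: loglik(t, data))
    return best, loglik(best, data)

def select_model(data, rng):
    """Fit every candidate by evolution, then pick the lowest BIC."""
    lo, hi = min(data), max(data)
    candidates = {
        "single": (loglik_single, [(lo, hi), (-3.0, 3.0)]),
        "mixture2": (loglik_mixture,
                     [(-6.0, 6.0), (lo, hi), (-3.0, 3.0), (lo, hi), (-3.0, 3.0)]),
    }
    results = {}
    for name, (loglik, bounds) in candidates.items():
        theta, ll = evolve(loglik, bounds, data, rng)
        bic = len(bounds) * math.log(len(data)) - 2.0 * ll
        results[name] = {"theta": theta, "loglik": ll, "bic": bic}
    best_name = min(results, key=lambda n: results[n]["bic"])
    return best_name, results

# Synthetic imbalanced data: 190 majority points and 10 minority points.
rng = random.Random(0)
data = [rng.gauss(0, 1) for _ in range(190)] + [rng.gauss(8, 1) for _ in range(10)]
best_name, results = select_model(data, rng)
print(best_name, {k: round(v["bic"], 1) for k, v in results.items()})
```

In this sketch the mixture's second component is what gives the small group of points its own cluster; a single-Gaussian fit would absorb them into a wider majority distribution, which is the kind of concealment the abstract describes.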

Details

Language :
English
ISSN :
0925-2312
Volume :
211
Database :
Academic Search Index
Journal :
Neurocomputing
Publication Type :
Academic Journal
Accession number :
118266700
Full Text :
https://doi.org/10.1016/j.neucom.2015.10.140