Probability model selection and parameter evolutionary estimation for clustering imbalanced data without sampling.
- Source :
- Neurocomputing. Oct2016, Vol. 211, p172-181. 10p.
- Publication Year :
- 2016
Abstract
- Data imbalance problems arising from the accumulation of data, especially big data, have become a challenging issue in recent years. In imbalanced data, the minor data sets are likely to contain many important patterns. Although there are some approaches for discovering class patterns, few of them have been applied to clustering minor patterns. Typically, the minor samples are submerged in big data, and without the supervision of a training set they are often ignored or misassigned to major patterns. Since clustering minorities is an uncertain process, in this paper we employ model selection and evolutionary computation to address the uncertainty and concealment of the minor data in imbalanced data clustering. Given a data set, model selection chooses a model from a set of candidate models. We select probability models as candidate models because they handle uncertainty effectively and are therefore well suited to data imbalance. Because the models' parameters are difficult to estimate, we employ an evolutionary process to adjust and estimate the optimal parameters. Experimental results show that the proposed approach for clustering imbalanced data is able to search for and discover minor patterns, and that it outperforms many other relevant clustering algorithms on several performance indices. [ABSTRACT FROM AUTHOR]
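The abstract does not specify the candidate probability models, the selection criterion, or the evolutionary operators, so the following is only a minimal sketch of the general idea under stated assumptions: the candidates are 1-D Gaussian mixtures with different numbers of components, parameters are estimated by a simple elitist evolutionary search that maximises the log-likelihood, and the model is chosen by BIC. All function names (`log_likelihood`, `evolve`, `bic`, `select_model`) and all settings are hypothetical and are not taken from the paper.

```python
import math
import random


def log_likelihood(data, params):
    """Log-likelihood of 1-D data under a Gaussian mixture.

    params is a list of (weight, mean, std); weights are normalised here.
    """
    total_w = sum(w for w, _, _ in params)
    ll = 0.0
    for x in data:
        p = sum(
            (w / total_w) * math.exp(-0.5 * ((x - mu) / sd) ** 2)
            / (sd * math.sqrt(2.0 * math.pi))
            for w, mu, sd in params
        )
        ll += math.log(max(p, 1e-300))  # guard against log(0)
    return ll


def evolve(data, k, rng, generations=120, pop_size=20):
    """Elitist evolutionary search for the parameters of a k-component mixture.

    Assumed operators: Gaussian mutation and truncation selection.
    Returns (best log-likelihood, best params).
    """
    lo, hi = min(data), max(data)
    spread = (hi - lo) or 1.0
    step = 0.05 * spread  # mutation scale for the means

    def mutate(params):
        return [
            (max(1e-3, w + rng.gauss(0.0, 0.05)),        # keep weights positive
             mu + rng.gauss(0.0, step),
             max(0.1, sd + rng.gauss(0.0, 0.5 * step)))  # floor on std
            for w, mu, sd in params
        ]

    pop = [[(1.0, rng.uniform(lo, hi), spread / 4.0) for _ in range(k)]
           for _ in range(pop_size)]
    scored = [(log_likelihood(data, p), p) for p in pop]
    for _ in range(generations):
        children = [mutate(p) for _, p in scored]
        scored += [(log_likelihood(data, c), c) for c in children]
        scored.sort(key=lambda t: t[0], reverse=True)
        scored = scored[:pop_size]  # parents compete with children
    return scored[0]


def bic(data, params):
    """Bayesian information criterion; lower is better."""
    n_free = 3 * len(params) - 1  # k means, k stds, k - 1 free weights
    return n_free * math.log(len(data)) - 2.0 * log_likelihood(data, params)


def select_model(data, candidate_ks, rng):
    """Fit every candidate mixture and return (best k, {k: (BIC, params)})."""
    fits = {}
    for k in candidate_ks:
        _, params = evolve(data, k, rng)
        fits[k] = (bic(data, params), params)
    best_k = min(fits, key=lambda k: fits[k][0])
    return best_k, fits


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic imbalanced data: 150 major samples, 10 minor samples.
    data = ([rng.gauss(0.0, 1.0) for _ in range(150)]
            + [rng.gauss(8.0, 0.5) for _ in range(10)])
    best_k, fits = select_model(data, [1, 2], rng)
    print("selected number of components:", best_k)
    for weight, mean, std in fits[best_k][1]:
        print(f"weight={weight:.3f} mean={mean:.2f} std={std:.2f}")
```

No resampling is applied in this sketch: the minor group can only be recovered through a low-weight mixture component, which is one way to read "without sampling" in the title. The printed weights are unnormalised.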
Details
- Language :
- English
- ISSN :
- 09252312
- Volume :
- 211
- Database :
- Academic Search Index
- Journal :
- Neurocomputing
- Publication Type :
- Academic Journal
- Accession number :
- 118266700
- Full Text :
- https://doi.org/10.1016/j.neucom.2015.10.140