COLI: Collaborative clustering missing data imputation
- Source :
- Electrical and Computer Engineering Publications
- Publication Year :
- 2021
- Publisher :
- Elsevier BV, 2021.
Abstract
- Missing data imputation plays an important role in the data cleansing process. Clustering algorithms have been widely used for missing data imputation, yet there is little research on the use of clustering ensembles, which aggregate multiple clustering results, for this task. This paper proposes a novel collaborative clustering-based imputation method, called COLI, which uses imputation quality as a key criterion for the exchange of information between different clustering results. To the best of our knowledge, this is the first study on the impact of collaborative clustering on imputation performance. The main contributions of this paper are three-fold. First, a novel missing value imputation method based on collaborative clustering is proposed. Second, three amputation strategies are used to induce missingness on various complete, publicly available numerical datasets, which allows the imputation quality of the proposed method to be evaluated under different missingness mechanisms, distributions, and ratios. Third, the proposed method is compared to several state-of-the-art imputation methods, and the results demonstrate that it is an effective method for handling missing data.
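The evaluation loop the abstract describes (ampute a complete dataset, impute it, score the imputed entries against the hidden truth) can be illustrated with a minimal sketch. This is not the COLI algorithm: it uses a single k-means clustering rather than collaborating clusterings, and it assumes MCAR (missing completely at random) as the amputation mechanism, which the abstract does not name. The function names `ampute_mcar`, `kmeans_impute`, `rmse`, and `run_demo` are hypothetical, and the data are synthetic.

```python
import numpy as np


def ampute_mcar(X, ratio, rng):
    """Hide roughly `ratio` of the entries completely at random (MCAR)."""
    mask = rng.random(X.shape) < ratio      # True where a value is hidden
    X_miss = X.copy()
    X_miss[mask] = np.nan
    return X_miss, mask


def kmeans_impute(X_miss, k, n_iter, rng):
    """Fill NaNs with the matching coordinate of each row's cluster centroid."""
    mask = np.isnan(X_miss)
    col_means = np.nanmean(X_miss, axis=0)
    X = np.where(mask, col_means, X_miss)   # start from mean imputation
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign every row to its nearest centroid (squared Euclidean distance).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
        # Refresh only the missing entries; observed values are never changed.
        X = np.where(mask, centroids[labels], X_miss)
    return X


def rmse(X_true, X_imp, mask):
    """Root-mean-square error over the amputed entries only."""
    return float(np.sqrt(np.mean((X_true[mask] - X_imp[mask]) ** 2)))


def run_demo(seed=0):
    rng = np.random.default_rng(seed)
    # Synthetic complete data: two well-separated Gaussian blobs in 3-D.
    X_true = np.vstack([
        rng.normal(0.0, 1.0, size=(100, 3)),
        rng.normal(10.0, 1.0, size=(100, 3)),
    ])
    X_miss, mask = ampute_mcar(X_true, ratio=0.2, rng=rng)
    X_kmeans = kmeans_impute(X_miss, k=2, n_iter=10, rng=rng)
    X_mean = np.where(mask, np.nanmean(X_miss, axis=0), X_miss)
    return X_true, mask, X_kmeans, X_mean


if __name__ == "__main__":
    X_true, mask, X_kmeans, X_mean = run_demo()
    print("k-means imputation RMSE:", rmse(X_true, X_kmeans, mask))
    print("mean imputation RMSE:   ", rmse(X_true, X_mean, mask))
```

Scoring only the amputed entries is what makes the comparison meaningful, since the observed entries are reproduced exactly by any imputer. Other missingness mechanisms, distributions, and ratios would replace `ampute_mcar` and its `ratio` argument.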
- Subjects :
- Data cleansing
Computer science
02 engineering and technology
computer.software_genre
03 medical and health sciences
Artificial Intelligence
Missing data imputation
0202 electrical engineering, electronic engineering, information engineering
Missing value imputation
Statistics::Methodology
Imputation (statistics)
Cluster analysis
030304 developmental biology
0303 health sciences
Collaborative clustering
Statistics::Applications
Data amputation
Missing data
Data_GENERAL
Signal Processing
Key (cryptography)
020201 artificial intelligence & image processing
Computer Vision and Pattern Recognition
Data mining
computer
Software
Details
- ISSN :
- 01678655
- Volume :
- 152
- Database :
- OpenAIRE
- Journal :
- Pattern Recognition Letters
- Accession number :
- edsair.doi.dedup.....e5f7efadfd6286f74058808222563c6c