Robust Model Design by Comparative Evaluation of Clustering Algorithms
- Source :
- IEEE Access, Vol 11, Pp 88135-88151 (2023)
- Publication Year :
- 2023
- Publisher :
- IEEE, 2023.
Abstract
- The K-means algorithm, widely used in cluster analysis, is a centroid-based clustering method known for its high efficiency and scalability. In realistic operating environments, however, data are susceptible to contamination from outliers and distribution departures, which can distort K-means clustering results or render them invalid. In this paper, we introduce three alternative algorithms for weighted data that address these issues: K-weighted-medians, K-weighted-L2-medians, and K-weighted-HLs. We investigate the impact of contamination by examining its effect on the estimation of optimal cluster centroids. We explore the robustness of the clustering algorithms from the perspective of the breakdown point, and then conduct experiments on simulated and real datasets to evaluate their performance using two new numerical metrics: relative efficiencies based on generalized variance and on average Euclidean distance. The results demonstrate the effectiveness of the proposed K-weighted-HLs algorithm, which surpasses the other algorithms in scenarios involving both types of contamination.
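The robustness argument in the abstract can be illustrated on a single cluster in one dimension. The sketch below is not the paper's method: this record does not give the definitions of the weighted, multivariate estimators, so it assumes the classical univariate Hodges-Lehmann estimator (median of pairwise averages) as a stand-in for the "HL" centroid and a simple lower weighted median for the "weighted-median" centroid. The function names `weighted_median` and `hodges_lehmann` are hypothetical.

```python
from itertools import combinations_with_replacement
from statistics import mean, median


def weighted_median(values, weights):
    """Lower weighted median: the smallest value whose cumulative
    weight reaches half of the total weight (weights assumed positive)."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("values must be non-empty")


def hodges_lehmann(values):
    """Classical (unweighted) Hodges-Lehmann location estimate:
    the median of all pairwise averages (x_i + x_j) / 2 with i <= j."""
    return median((a + b) / 2.0
                  for a, b in combinations_with_replacement(values, 2))


# One cluster of clean points, then the same cluster with a single gross outlier.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = clean + [1000.0]

for name, data in (("clean", clean), ("contaminated", contaminated)):
    print(name,
          "mean:", mean(data),
          "weighted median:", weighted_median(data, [1.0] * len(data)),
          "Hodges-Lehmann:", hodges_lehmann(data))
```

On this toy data the mean, which is the K-means centroid update, is dragged far from the bulk of the points by one outlier, while both robust estimates stay close to the clean center. This is the intuition behind comparing the algorithms by breakdown point.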
Details
- Language :
- English
- ISSN :
- 2169-3536
- Volume :
- 11
- Database :
- Directory of Open Access Journals
- Journal :
- IEEE Access
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.54ae6baa35a046328bcfd8835813d8a4
- Document Type :
- article
- Full Text :
- https://doi.org/10.1109/ACCESS.2023.3306023