An extension of the K-means algorithm to clustering skewed data
- Source :
- Computational Statistics. 34:373-394
- Publication Year :
- 2018
- Publisher :
- Springer Science and Business Media LLC, 2018.
Abstract
- Grouping similar objects into common groups, also known as clustering, is an important problem in unsupervised machine learning. Various clustering algorithms have been proposed in the literature. In recent years, the need to analyze large amounts of data has led to reconsidering some fundamental clustering procedures. One of them is the celebrated K-means algorithm, popular among practitioners due to its speed and appealingly intuitive construction. Unfortunately, the algorithm often performs poorly unless the data groups have spherical shapes and approximately the same sizes. In many applications, this restriction is so severe that the use of the K-means algorithm becomes questionable, misleading, or simply incorrect. We propose an extension of K-means that preserves the speed and intuitive interpretation of the original algorithm while providing greater flexibility in modeling clusters. The proposed generalization relies on the exponential transformation of Manly, originally designed to obtain near-normally distributed data. The suggested modification is derived and illustrated on several datasets with good results.
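As a rough illustration of the transformation named in the abstract, the sketch below applies the Manly exponential transformation, y = (exp(λx) − 1)/λ for λ ≠ 0 and y = x for λ = 0, to a small right-skewed sample. It shows only the transformation itself. How the paper embeds it in K-means is not described in this record, so nothing here should be read as the authors' algorithm. The function names, the sample values, and the hand-picked λ = −0.5 are all assumptions made for the example.

```python
import math


def manly_transform(x, lam):
    """Manly exponential transformation of a single value.

    Returns (exp(lam * x) - 1) / lam for lam != 0 and x itself for lam == 0.
    """
    if lam == 0.0:
        return x
    # expm1 computes exp(z) - 1 accurately when z is close to zero
    return math.expm1(lam * x) / lam


def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed sample (illustrative values only)
data = [0.1, 0.2, 0.3, 0.5, 0.8, 1.3, 2.1, 3.4]

# Hand-picked lambda (an assumption); a negative value compresses the long right tail
lam = -0.5
transformed = [manly_transform(v, lam) for v in data]

print("skewness before:", skewness(data))
print("skewness after: ", skewness(transformed))
```

With a negative λ the larger values are pulled in more strongly than the smaller ones, so the transformed sample is less right-skewed than the original.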
- Subjects :
- Statistics and Probability
Flexibility (engineering)
Interpretation (logic)
Theoretical computer science
Computer science
Generalization
05 social sciences
k-means clustering
Extension (predicate logic)
01 natural sciences
010104 statistics & probability
Computational Mathematics
Skewness
0502 economics and business
Unsupervised learning
0101 mathematics
Statistics, Probability and Uncertainty
Cluster analysis
050205 econometrics
Details
- ISSN :
- 1613-9658 and 0943-4062
- Volume :
- 34
- Database :
- OpenAIRE
- Journal :
- Computational Statistics
- Accession number :
- edsair.doi...........19765f8a82d45c56ec4032dfa45a8afc