Back to Search
Start Over
Greedy centroid initialization for federated K-means.
- Source :
- Knowledge & Information Systems; Jun2024, Vol. 66 Issue 6, p3393-3425, 33p
- Publication Year :
- 2024
-
Abstract
- We study learning from unlabeled data distributed across clients in a federated fashion where raw data do not leave the corresponding devices. We develop a K-means clustering algorithm within this federated setting where the local datasets are clustered at the clients, and a server generates the global clusters after aggregating the local ones. Given the importance of initialization on the federated K ̲ -means algorithm (FKM), our objective is to find better initial centroids by utilizing the local data stored on each client. To this end, we start the centroid initialization at the clients, rather than at the server, since the server initially lacks any preliminary insight into the clients' data. The clients first select their local initial clusters and subsequently share their clustering information (including cluster centroids and sizes) with the server. The server then employs a greedy algorithm to determine the global initial centroids based on the information received from the clients. We refer to this idea as G-FKM. Numerical results obtained from both synthetic and public datasets demonstrate that our proposed algorithm demonstrates accelerated convergence, exhibiting reduced within-cluster sum of squares ( WCSS ¯ ) and higher adjusted Rand Index compared to three distinct federated K-means variants. This improvement comes at a relatively low cost of sending limited additional information from the clients to the server, rather than conducting the initialization entirely at the server. Furthermore, we have also observed that the proposed algorithm performs better than the centralized algorithm for cases where the data distribution across the clients is highly skewed. [ABSTRACT FROM AUTHOR]
- Subjects :
- CENTROID
K-means clustering
GREEDY algorithms
SUM of squares
DATA distribution
Subjects
Details
- Language :
- English
- ISSN :
- 02191377
- Volume :
- 66
- Issue :
- 6
- Database :
- Complementary Index
- Journal :
- Knowledge & Information Systems
- Publication Type :
- Academic Journal
- Accession number :
- 177538829
- Full Text :
- https://doi.org/10.1007/s10115-024-02066-x