1. Hybrid Strategy for Selecting Compact Set of Clustering Partitions
- Author
- Katti Faceli, Vanessa Antunes, Tiemi C. Sakata, and Marcilio C. P. de Souto (Universidade Federal de São Carlos, Sorocaba (UFSCar/Sorocaba), Brazil; Laboratoire d'Informatique Fondamentale d'Orléans (LIFO); Institut National des Sciences Appliquées - Centre Val de Loire (INSA CVL); Université d'Orléans (UO))
- Subjects
0209 industrial biotechnology, Multiobjective optimisation, Computer science, Rand index, Clustering Algorithm, 02 engineering and technology, Multi-objective optimization, Partition (database), Evolutionary computation, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], 020901 industrial engineering & automation, Compact space, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Cluster analysis, Selection algorithm, Algorithm, Software
- Abstract
Selecting the most appropriate clustering algorithm is not a straightforward task, given that no clustering algorithm is capable of determining the actual groups present in every dataset. A potential solution is to use different clustering algorithms to produce a set of partitions (solutions) and then select the best partition according to a specified validation measure; however, these measures are generally biased toward one or more clustering algorithms. Moreover, in several real cases, it is important to have more than one solution as the output. To address these problems, we present a hybrid partition selection algorithm, HSS, which accepts as input a set of base partitions, potentially generated by clustering algorithms with different biases, and aims to return a reduced yet diverse set of partitions (solutions). HSS comprises three steps: (i) the application of a multiobjective algorithm to a set of base partitions to generate a Pareto Front (PF) approximation; (ii) the division of the solutions from the PF approximation into a certain number of regions; and (iii) the selection of one solution per region by applying the Adjusted Rand Index. We compare the results of our algorithm with those of another selection strategy, ASA. Furthermore, we test HSS as a post-processing tool for two clustering algorithms based on multiobjective evolutionary computing: MOCK and MOCLE. The experiments revealed the effectiveness of HSS in selecting a reduced number of partitions while maintaining their quality.
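The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not specify several details, so the following are assumptions made only for illustration: the objective scores are already computed and are to be minimized, non-dominated filtering stands in for the multiobjective algorithm of step (i), regions are contiguous chunks of the front sorted by the first objective, and the solution chosen in each region is the one with the highest mean Adjusted Rand Index to the other partitions in that region. The function names `pareto_front` and `select_partitions` are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand Index between two label sequences of equal length."""
    total = comb(len(a), 2)
    if total == 0:
        return 1.0
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / total
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def pareto_front(scores):
    """Indices of non-dominated score tuples (assumption: all objectives minimized)."""
    def dominates(p, q):
        return (all(x <= y for x, y in zip(p, q))
                and any(x < y for x, y in zip(p, q)))

    return [i for i, p in enumerate(scores)
            if not any(dominates(q, p) for j, q in enumerate(scores) if j != i)]


def select_partitions(partitions, scores, n_regions):
    """Return the index of one representative partition per region of the front."""
    # Step (i): approximate the Pareto front from the given objective scores.
    front = sorted(pareto_front(scores), key=lambda i: scores[i])
    k = min(n_regions, len(front))
    selected = []
    for r in range(k):
        # Step (ii): assumed region rule, contiguous chunks of the sorted front.
        region = front[r * len(front) // k:(r + 1) * len(front) // k]

        def mean_ari(i):
            others = [j for j in region if j != i]
            if not others:
                return 1.0
            return sum(adjusted_rand_index(partitions[i], partitions[j])
                       for j in others) / len(others)

        # Step (iii): assumed selection rule, highest mean ARI within the region.
        selected.append(max(region, key=mean_ari))
    return selected
```

Calling `select_partitions(partitions, scores, n_regions)` with a list of label vectors and a matching list of objective tuples returns the indices of the retained partitions, at most one per region.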
- Published
- 2020