1. Software Clustering Using Automated Feature Subset Selection
- Author
- Abdun Naser Mahmood, Mehmet A. Orgun, Sara Shahzad, Zubair Shah, and Rashid Naseem
- Subjects
Computer science, Software architecture recovery, Feature selection, Software maintenance, Machine learning, Feature model, Software, Feature (computer vision), Data mining, Artificial intelligence, Software system, Cluster analysis
- Abstract
This paper proposes a feature selection technique for software clustering that can be used in the architecture recovery of software systems. The recovered architecture can then be used in the subsequent phases of software maintenance, reuse, and re-engineering. A number of diverse features can be extracted from the source code of software systems; however, some of the extracted features may carry little information for computing associations between entities, which lowers the quality of the resulting software clusters. Therefore, further research is required to select those features that are highly relevant for finding associations between entities. In this article, we first propose a supervised feature selection technique for unlabeled data, and then apply this technique to software clustering. A number of feature subset selection techniques have been proposed for software architecture recovery; however, none of them focuses on automated feature selection in this domain. Experimental results on three software test systems reveal that our proposed approach produces results that are closer to the decompositions prepared by human experts than those discovered by the well-known K-Means algorithm.
- Published
- 2013