Effective Data Reduction Using Discriminative Feature Selection Based on Principal Component Analysis

Authors :: Faith Nwokoma
Justin Foreman
Cajetan M. Akujuobi
Source :: Machine Learning and Knowledge Extraction, Vol 6, Iss 2, Pp 789-799 (2024)
Publication Year :: 2024
Publisher :: MDPI AG, 2024.
Abstract :: Effective data reduction must retain as much of the informative content of the data as possible. Feature selection is the default approach to dimensionality reduction because it usually retains the relevant features of a dataset. In this study, we used unsupervised learning to discover the top-k discriminative features in a large multivariate IoT dataset. We used the statistics of principal component analysis to filter the relevant features based on the ranks of the features along the principal directions while also considering the coefficients of the components. The number of principal components selected determined the number of features to select in the SVD process. A number of experiments were conducted using different benchmark datasets, and the effectiveness of the proposed method was evaluated based on the reconstruction error. The results were verified by applying the algorithm to a large IoT dataset, and its performance, measured by accuracy and reconstruction error, was compared with the results on the benchmark datasets. The evaluation was consistent with the benchmark results, which showed high accuracy and low reconstruction error.
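
The abstract describes ranking features by their PCA coefficients and letting the number of retained components set the number of selected features. The sketch below is one possible reading of that idea, not the authors' published algorithm. The function names, the variance threshold used to choose k, the variance-weighted scoring rule, and the least-squares reconstruction error are all assumptions made for illustration.

```python
import numpy as np

def select_features_pca(X, var_threshold=0.95):
    """Pick k features from PCA loadings (hypothetical helper).

    k is the number of components needed to reach var_threshold of the
    total variance; this stopping rule is an assumption, not the paper's.
    """
    Xc = X - X.mean(axis=0)                      # centre each feature
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = s**2 / np.sum(s**2)                  # explained-variance ratio
    k = int(np.searchsorted(np.cumsum(ratio), var_threshold)) + 1
    k = min(k, len(s))
    # Score each feature by its absolute coefficients on the k retained
    # components, weighted by each component's share of the variance.
    scores = np.abs(Vt[:k]).T @ ratio[:k]
    selected = np.sort(np.argsort(scores)[::-1][:k])
    return selected, k

def reconstruction_error(X, selected):
    """Relative squared error when all features are rebuilt from the
    selected ones by least squares (an assumed error definition)."""
    Xc = X - X.mean(axis=0)
    S = Xc[:, selected]
    B, *_ = np.linalg.lstsq(S, Xc, rcond=None)   # Xc is approximated by S @ B
    return np.linalg.norm(Xc - S @ B) ** 2 / np.linalg.norm(Xc) ** 2
```

Calling `select_features_pca(X)` on an array of shape (samples, features) returns the sorted indices of the chosen columns together with k, and `reconstruction_error(X, selected)` gives a value between 0 and 1, where lower means the selected columns explain more of the data.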

Subjects :: feature selection
feature reduction
principal component analysis (PCA)
singular value decomposition
Computer engineering. Computer hardware
TK7885-7895
