Data Analysis for Information Discovery

Authors :
Alberto Amato
Vincenzo Di Lecce
Source :
Applied Sciences, Vol 13, Iss 6, p 3481 (2023)
Publication Year :
2023
Publisher :
MDPI AG, 2023.

Abstract

Artificial intelligence applications are becoming increasingly popular and are producing better results in many areas of research. The quality of the results depends on the quantity of data and its information content. In recent years, the amount of data available has increased significantly, but more data does not always mean more information and therefore better results. The aim of this work is to evaluate the effects of a new data preprocessing method for machine learning. This method was designed for sparse matrix approximation and is called semi-pivoted QR approximation (SPQR). To the best of our knowledge, it has never been applied to data preprocessing in machine learning algorithms. The method works as a feature selection algorithm, and this work evaluates its effects on the performance of an unsupervised clustering algorithm. The results are compared with those obtained using principal component analysis (PCA) as the preprocessing algorithm. Both methods were applied to various publicly available datasets. The results show that the SPQR algorithm can achieve results comparable to those obtained using PCA without introducing any transformation of the original dataset.
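
The contrast the abstract draws between feature selection and PCA can be illustrated with a minimal sketch. This is not the SPQR algorithm evaluated in the paper: it assumes an ordinary greedy column-pivoted QR (Gram-Schmidt with deflation) as a stand-in, and the function names `pivoted_qr_select` and `pca_project`, the tolerance, and the synthetic data are all hypothetical. It shows only that a pivoted-QR selection returns a subset of the original columns, whereas PCA returns new, transformed coordinates.

```python
import numpy as np

def pivoted_qr_select(X, k, tol=1e-10):
    """Pick up to k column indices of X by greedy column-pivoted Gram-Schmidt.

    Illustrative stand-in only; not the SPQR algorithm from the paper.
    """
    R = np.asarray(X, dtype=float).copy()
    R -= R.mean(axis=0)                       # center each feature
    norms = np.einsum("ij,ij->j", R, R)       # squared norm of each column
    threshold = tol * norms.max()             # relative stopping tolerance (assumed)
    selected = []
    for _ in range(k):
        norms = np.einsum("ij,ij->j", R, R)   # squared residual norms
        if selected:
            norms[selected] = -1.0            # never pick a column twice
        j = int(np.argmax(norms))
        if norms[j] <= threshold:             # remaining columns are redundant
            break
        q = R[:, j] / np.sqrt(norms[j])       # new orthonormal direction
        R -= np.outer(q, q @ R)               # remove that direction from all columns
        selected.append(j)
    return selected

def pca_project(X, k):
    """Project centered X onto its first k principal components (via SVD)."""
    Xc = np.asarray(X, dtype=float) - np.mean(X, axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

# Synthetic data (assumed): three independent features plus two redundant ones.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
X = np.column_stack([base, base[:, 0] + base[:, 1], 2.0 * base[:, 0]])

selected = pivoted_qr_select(X, 3)
X_sel = X[:, selected]     # untouched original columns
X_pca = pca_project(X, 3)  # linear combinations of all columns
```

Either `X_sel` or `X_pca` could then be passed to a clustering algorithm. The selected columns keep their original meaning and units, which is the property the abstract highlights.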

Details

Language :
English
ISSN :
20763417
Volume :
13
Issue :
6
Database :
Directory of Open Access Journals
Journal :
Applied Sciences
Publication Type :
Academic Journal
Accession number :
edsdoj.b8fcdd7ab7054352bb56fe2c40412818
Document Type :
article
Full Text :
https://doi.org/10.3390/app13063481