CSViz: Class Separability Visualization for high-dimensional datasets.

Authors :
Cuesta, Marina
Lancho, Carmen
Fernández-Isabel, Alberto
Cano, Emilio L.
Martín De Diego, Isaac
Source :
Applied Intelligence; Jan2024, Vol. 54 Issue 1, p924-946, 23p
Publication Year :
2024

Abstract

Data visualization is an essential task throughout the lifecycle of any Data Science (DS) project, particularly during Exploratory Data Analysis (EDA), where it supports correct data preparation and understanding. In classification problems, data visualization is useful for revealing class separability patterns within the dataset. This information is valuable and can later be used when building a Machine Learning (ML) model. High-Dimensional Data (HDD) arise as one of the biggest challenges in DS. HDD require special treatment since traditional visualization techniques, such as the scatterplot matrix (SPLOM), are limited by space restrictions when applied to them. Other visualization methods involve dimensionality reduction techniques, which can lose important information and reduce the interpretability of the data. In this paper, the Class Separability Visualization (CSViz) method is introduced as a new Visual Analytics (VA) approach to address the challenge of visualizing labeled HDD through subspaces. The proposed method provides an overview of class separability by offering a series of 2-dimensional subspace visualizations of the original variables, each containing an exclusive subset of points, that encompass the most valuable and significant separable patterns. The proposed method is tested on 50 datasets with different characteristics, providing promising results. In all cases, more than 90% of the data observations are shown with three plots or fewer. Hence, CSViz significantly eases EDA by reducing the number of plots to be inspected in a SPLOM and, thus, the time invested in it. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0924-669X
Volume :
54
Issue :
1
Database :
Complementary Index
Journal :
Applied Intelligence
Publication Type :
Academic Journal
Accession number :
174800991
Full Text :
https://doi.org/10.1007/s10489-023-05149-4