DD-HDS: A Method for Visualization and Exploration of High-Dimensional Data
- Source :
- IEEE Transactions on Neural Networks, Institute of Electrical and Electronics Engineers, 2007, 18 (5), pp.1265-79, HAL
- Publication Year :
- 2007
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2007.
Abstract
- Mapping high-dimensional data in a low-dimensional space, for example, for visualization, is a problem of increasingly major concern in data analysis. This paper presents data-driven high-dimensional scaling (DD-HDS), a nonlinear mapping method that follows the line of multidimensional scaling (MDS) approach, based on the preservation of distances between pairs of data. It improves the performance of existing competitors with respect to the representation of high-dimensional data, in two ways. It introduces (1) a specific weighting of distances between data taking into account the concentration of measure phenomenon and (2) a symmetric handling of short distances in the original and output spaces, avoiding false neighbor representations while still allowing some necessary tears in the original distribution. More precisely, the weighting is set according to the effective distribution of distances in the data set, with the exception of a single user-defined parameter setting the tradeoff between local neighborhood preservation and global mapping. The optimization of the stress criterion designed for the mapping is realized by "force-directed placement" (FDP). The mappings of low- and high-dimensional data sets are presented as illustrations of the features and advantages of the proposed algorithm. The weighting function specific to high-dimensional data and the symmetric handling of short distances can be easily incorporated in most distance preservation-based nonlinear dimensionality reduction methods.
- Subjects :
- Clustering high-dimensional data
Computer Networks and Communications
Information Storage and Retrieval
02 engineering and technology
Set (abstract data type)
non-linear mapping
03 medical and health sciences
Imaging, Three-Dimensional
Data visualization
Artificial Intelligence
Multi Dimensional Scaling
Computer Graphics
0202 electrical engineering, electronic engineering, information engineering
Computer Simulation
Multidimensional scaling
[INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]
030304 developmental biology
Mathematics
0303 health sciences
[SDV.BIBS] Life Sciences [q-bio]/Quantitative Methods [q-bio.QM]
neighborhood visualization
Dimensionality reduction
Nonlinear dimensionality reduction
General Medicine
Models, Theoretical
Computer Science Applications
Weighting
Data set
high-dimensional data
Data Display
020201 artificial intelligence & image processing
Data mining
Algorithm
Algorithms
Software
Details
- ISSN :
- 1045-9227
- Volume :
- 18
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Neural Networks
- Accession number :
- edsair.doi.dedup.....1cbc78246ada196b2869c684e13552b6
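
Illustration
The abstract describes a stress criterion that weights distance errors according to the distribution of distances in the data set and treats short distances symmetrically in the original and output spaces. The following is a minimal Python sketch of that idea, not the authors' implementation. The abstract does not give the weighting function, so the decreasing sigmoid used here (one minus a Gaussian CDF), the way its centre and width are derived from the mean and standard deviation of the distances, the parameter name lam, and all function names are assumptions made for illustration. The force-directed placement optimizer is not shown.

```python
import math
import statistics
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distances for every pair i < j, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def make_weight(high_dists, lam=0.5):
    """Build a decreasing weight from the observed distance distribution.

    Assumption: the weight is 1 - Gaussian CDF, centred below the mean
    distance. lam is a hypothetical stand-in for the single user-defined
    parameter trading local neighborhood preservation against global mapping.
    """
    mean = statistics.fmean(high_dists)
    std = statistics.pstdev(high_dists)
    mu = mean - 2.0 * (1.0 - lam) * std
    sigma = max(2.0 * lam * std, 1e-12)  # guard against zero spread

    def weight(d):
        return 0.5 * (1.0 - math.erf((d - mu) / (sigma * math.sqrt(2.0))))

    return weight


def weighted_stress(high_points, low_points, lam=0.5):
    """Sum of |d_high - d_low|, weighted by the shorter of the two distances.

    Taking the minimum means a pair counts as "short" if it is close in
    either space, so false neighbors and tears are penalized alike.
    """
    d_high = pairwise_distances(high_points)
    d_low = pairwise_distances(low_points)
    weight = make_weight(d_high, lam)
    return sum(abs(dh - dl) * weight(min(dh, dl))
               for dh, dl in zip(d_high, d_low))
```

Under these assumptions, a mapping that reproduces every pairwise distance has zero stress, and errors on pairs that are far apart in both spaces contribute little because their weight is close to zero.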