Integrated dimensionality reduction technique for mixed-type data involving categorical values.

Authors :
Hsu, Chung-Chian
Huang, Wei-Hao
Source :
Applied Soft Computing; Jun 2016, Vol. 43, p199-209, 11p
Publication Year :
2016

Abstract

Dimensionality reduction is a useful technique for coping with the high dimensionality of real-world data. However, traditional methods were studied in the context of datasets with only numeric attributes. To meet the demand for analyzing datasets involving categorical attributes, an extension to the recent dimensionality-reduction technique t-SNE is proposed. The extension enables t-SNE to handle mixed-type datasets. Each attribute of the data is associated with a distance hierarchy, which allows the distance between numeric values and between categorical values to be measured in a unified manner. More importantly, domain knowledge about distances that reflect the semantics embedded in categorical values can be specified via the hierarchy. Consequently, the extended t-SNE can project high-dimensional, mixed data to a low-dimensional space with a topological order that reflects the user's intuition. [ABSTRACT FROM AUTHOR]
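
The abstract does not spell out how a distance hierarchy is built, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a categorical attribute is described by a weighted tree, that the distance between two categories is the length of the tree path joining them, and that a numeric attribute uses a range-scaled absolute difference. The tree `DRINK_TREE`, the helper names, and the `scales` parameter are all hypothetical.

```python
import math

# Hypothetical hierarchy for a "drink" attribute: child -> (parent, edge weight).
# Values under the same parent are meant to be semantically closer.
DRINK_TREE = {
    "coke": ("carbonated", 1.0),
    "pepsi": ("carbonated", 1.0),
    "latte": ("coffee", 1.0),
    "carbonated": ("any", 1.0),
    "coffee": ("any", 1.0),
}


def path_to_root(value, tree):
    """Map each ancestor of `value` (and `value` itself) to its distance from `value`."""
    dist = 0.0
    out = {value: 0.0}
    while value in tree:
        parent, weight = tree[value]
        dist += weight
        out[parent] = dist
        value = parent
    return out


def tree_distance(a, b, tree):
    """Length of the path between a and b through their lowest common ancestor."""
    up_a = path_to_root(a, tree)
    up_b = path_to_root(b, tree)
    return min(up_a[n] + up_b[n] for n in up_a if n in up_b)


def mixed_distance(x, y, trees, scales):
    """Euclidean-style distance over mixed attributes (an assumed combination rule).

    trees:  {attribute index: hierarchy} for categorical attributes.
    scales: {attribute index: normalising constant}, e.g. the numeric range
            or the longest path in the hierarchy.
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        if i in trees:
            d = tree_distance(xi, yi, trees[i])
        else:
            d = abs(xi - yi)
        total += (d / scales[i]) ** 2
    return math.sqrt(total)


# Two records of the form (drink, price).
a = ("coke", 10.0)
b = ("pepsi", 20.0)
c = ("latte", 10.0)
trees = {0: DRINK_TREE}
scales = {0: 4.0, 1: 100.0}
d_ab = mixed_distance(a, b, trees, scales)
d_ac = mixed_distance(a, c, trees, scales)
```

Here `d_ab` is smaller than `d_ac` even though `a` and `c` share the same price, because the hierarchy places "coke" nearer to "pepsi" than to "latte". A pairwise matrix of such distances could then be handed to a t-SNE implementation that accepts precomputed distances, for instance scikit-learn's `TSNE(metric="precomputed", init="random")`; whether that matches the paper's integration is not stated in this record.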

Details

Language :
English
ISSN :
15684946
Volume :
43
Database :
Supplemental Index
Journal :
Applied Soft Computing
Publication Type :
Academic Journal
Accession number :
114574830
Full Text :
https://doi.org/10.1016/j.asoc.2016.02.015