Intrinsic Dimensional Correlation Discretization for Mining Task

Authors :
Hong Wen Song
Yu Sang
Jun Zhao
Source :
Applied Mechanics and Materials. 404:548-554
Publication Year :
2013
Publisher :
Trans Tech Publications, Ltd., 2013.

Abstract

Discretization is a necessary pre-processing step in data mining tasks and a way to improve the performance of many machine learning algorithms. Existing techniques mainly focus on one-dimensional discretization in low-dimensional data spaces. In this paper, we present an intrinsic dimensional correlation discretization technique for high-dimensional data spaces. The approach estimates the intrinsic dimensionality (ID) of the data using maximum likelihood estimation (MLE). We then project the data onto the eigenspace of the estimated lower ID using principal component analysis (PCA), which can discover the potential correlation structure in multivariate data. Thus, all dimensions of the data are transformed into a new independent eigenspace of the ID, and each dimension can be discretized separately in that eigenspace with the MODL discretization method, which is based on a Bayesian discretization model. We design a heuristic framework to find a better discretization scheme. Our approach shows a significant improvement in the mean learning accuracy of classifiers over traditional discretization methods.
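
The pipeline described in the abstract (estimate the ID, project with PCA, discretize each projected dimension separately) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the neighbor count k and the bin count n_bins are hypothetical choices made for illustration. The ID estimate uses the nearest-neighbor form of the MLE estimator, which is assumed here to be what the abstract refers to. The final step substitutes plain equal-frequency binning for MODL, which is supervised and is not reproduced here, and the heuristic search for a better discretization scheme is omitted.

```python
import numpy as np


def mle_intrinsic_dim(X, k=10):
    """Nearest-neighbor MLE estimate of intrinsic dimensionality.

    Assumes continuous data without duplicate points; k is an
    illustrative choice, not a value taken from the paper.
    """
    # Full pairwise Euclidean distance matrix (fine for small n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Sort each row; column 0 is the zero self-distance, so skip it.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    knn = np.maximum(knn, 1e-12)  # guard against log(0)
    # log(T_k / T_j) for j = 1..k-1, one row per point.
    log_ratio = np.log(knn[:, -1][:, None] / knn[:, :-1])
    per_point = (k - 1) / np.maximum(log_ratio.sum(axis=1), 1e-12)
    return float(per_point.mean())


def pca_project(X, n_components):
    """Project centered data onto its leading principal directions."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def equal_frequency_bins(col, n_bins=5):
    """Stand-in for MODL: unsupervised equal-frequency binning."""
    inner_edges = np.quantile(col, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.searchsorted(inner_edges, col, side="right")


def discretize_in_eigenspace(X, k=10, n_bins=5):
    """Estimate the ID, project with PCA, then bin each new dimension."""
    d_hat = int(round(mle_intrinsic_dim(X, k)))
    d_hat = max(1, min(d_hat, X.shape[1]))  # keep within valid range
    Z = pca_project(X, d_hat)
    codes = np.column_stack(
        [equal_frequency_bins(Z[:, j], n_bins) for j in range(d_hat)]
    )
    return d_hat, codes
```

Calling discretize_in_eigenspace(X) on an array of shape (n_samples, n_features) returns the rounded ID estimate and an integer code matrix with one column per retained eigenspace dimension. A supervised method such as MODL would additionally use the class labels to choose the interval boundaries.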

Details

ISSN :
1662-7482
Volume :
404
Database :
OpenAIRE
Journal :
Applied Mechanics and Materials
Accession number :
edsair.doi...........1e91488e58c27718d20346bdb54e7f6e