1. Exploring High Dimension Large Data Correlation Analysis with Mutual Information and Application
- Author
Xiao-min Wang, Wen-yan Zhu, Yu-shan Jiang, and Dong-Kai Zhang
- Subjects
Data grid, Computer science, Data correlation, Nonparametric statistics, Partition problem, Entropy (information theory), Data mining, Mutual information, Grid, Information coefficient
- Abstract
Applying information entropy theory, we present a measure of dependence for multivariable relationships: the high-dimensional maximal mutual information coefficient (HMIC). It is a maximal information-based nonparametric exploration (MINE) statistic for identifying and classifying relationships in large data sets, and it generalizes the maximal information coefficient (MIC), which measures dependence between pairs of variables. To reduce the complexity of computing the HMIC, an improved uniform grid is proposed, based on the data grid idea. In addition, an optimal single-axis partition algorithm (SAR) is built to ensure the feasibility of the HMIC measurement. Finally, we apply the HMIC to analyze a data set of physical measurements of college students.
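To make the grid idea concrete, the following is a minimal sketch of the two-variable building block only: mutual information estimated on one fixed uniform (equal-width) grid, then divided by log2 of the smaller bin count, which is the normalization used by MIC. It is not the authors' HMIC or their single-axis partition algorithm, neither of which is specified in this record; MIC itself also maximizes over many grid resolutions and cell placements, which this sketch omits. The function names and the choice of equal-width bins are assumptions made for illustration.

```python
import math
from collections import Counter


def equal_width_bin(values, n_bins):
    """Assign each value to one of n_bins equal-width cells along one axis."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # min(...) keeps the maximum value inside the last cell.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def grid_mutual_information(xs, ys, bins_x, bins_y):
    """Mutual information (in bits) of the joint histogram on a uniform grid."""
    n = len(xs)
    bx = equal_width_bin(xs, bins_x)
    by = equal_width_bin(ys, bins_y)
    joint = Counter(zip(bx, by))   # counts per grid cell
    px = Counter(bx)               # marginal counts along x
    py = Counter(by)               # marginal counts along y
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log2(p(i,j) / (p(i) * p(j))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[i] * py[j]))
    return mi


def normalized_grid_mi(xs, ys, bins_x, bins_y):
    """Grid MI divided by log2(min(bins_x, bins_y)); assumes one fixed grid."""
    if min(bins_x, bins_y) < 2:
        raise ValueError("each axis needs at least 2 bins")
    return grid_mutual_information(xs, ys, bins_x, bins_y) / math.log2(
        min(bins_x, bins_y)
    )
```

On a fixed grid the normalized score lies between 0 and 1: it is 0 when the binned variables are independent and reaches 1 when one binned variable determines the other and the cells are equally populated. A high-dimensional extension would need a joint grid over more than two axes, which is where the cost grows quickly and where the grid and partition strategy matter.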
- Published
- 2016