
Intrinsic dimension estimation based on local adjacency information

Authors :
Youlong Yang
Benchong Li
Haiquan Qiu
Source :
Information Sciences. 558:21-33
Publication Year :
2021
Publisher :
Elsevier BV, 2021.

Abstract

The intrinsic dimension (ID) of a data set is crucial for data processing, especially for high-dimensional data sets. In order to obtain an accurate ID estimate, two neighborhoods of sample points with a radius ratio of $k$ are considered. The ratio of the number of sample points contained in the two neighborhoods is denoted as $q_\epsilon$. When the data set is located on a $d$-dimensional submanifold in $\mathbb{R}^D$, the expected value of $q_\epsilon$ is $k^d$. Based on this consideration, we redefine the adjacency matrix by using the local adjacency information of sample points and propose a new ID estimation method known as ID(k). The ID(k) algorithm contains only one parameter, the scaling ratio $k$, and we outline the criterion through which the user can select an appropriate $k$ value. We demonstrate the convergence of the new method both theoretically and experimentally. Experimental results from artificial and real data sets show that the estimates obtained by this new ID(k) method are closer to the true intrinsic dimension than those derived using similar methods.
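
The scaling relation in the abstract (the neighbor-count ratio is expected to equal k^d) can be illustrated with a short sketch. This is not the authors' ID(k) algorithm, whose redefined adjacency matrix and rule for choosing k are not given in this record; it only inverts the relation to get d ≈ log(q) / log(k). The function name `ratio_id_estimate`, the `radius` parameter, and the pooling of counts over all points are assumptions made for illustration.

```python
import numpy as np


def ratio_id_estimate(points, radius, k=2.0):
    """Crude intrinsic-dimension estimate from neighbor-count ratios.

    Illustrative only: it inverts E[q] = k**d and is NOT the paper's ID(k)
    method. `radius` and the pooled-count ratio are assumptions.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances (O(n^2) memory, fine for small samples).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Neighbors inside each ball, excluding the point itself.
    small = (dists <= radius).sum(axis=1) - 1
    large = (dists <= k * radius).sum(axis=1) - 1
    if small.sum() == 0:
        raise ValueError("radius too small: no neighbors found")
    # Pool counts over all points before taking the ratio, for stability.
    q = large.sum() / small.sum()
    return float(np.log(q) / np.log(k))


# Usage: a 2-D sheet embedded in 5-D space (the three extra coordinates are zero).
rng = np.random.default_rng(0)
sheet = np.hstack([rng.random((1000, 2)), np.zeros((1000, 3))])
print(ratio_id_estimate(sheet, radius=0.05, k=2.0))
```

On bounded samples the larger ball is clipped near the boundary, so a naive estimate like this one tends to fall slightly below the true dimension; a smaller `radius` reduces that bias at the cost of more sampling noise.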

Details

ISSN :
0020-0255
Volume :
558
Database :
OpenAIRE
Journal :
Information Sciences
Accession number :
edsair.doi...........3b0dd53187ec557c9957d77e200d2fa6