Distribution of Euclidean Distances Between Randomly Distributed Gaussian Points in n-Space

Authors :
Thirey, Benjamin
Hickman, Randal
Publication Year :
2015

Abstract

The curse of dimensionality is a common phenomenon which affects analysis of datasets characterized by large numbers of variables associated with each point. Problematic scenarios of this type frequently arise in classification algorithms which are heavily dependent upon distances between points, such as nearest-neighbor and $k$-means clustering. Given that contributing variables follow Gaussian distributions, this research derives the probability distribution that describes the distances between randomly generated points in n-space. The theoretical results are extended to examine additional properties of the distribution as the dimension becomes arbitrarily large. With this distribution of distances between randomly generated points in arbitrarily large dimensions, one can then determine the significance of distance measurements between any collection of individual points.

Comment: 13 pages, 4 figures
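The setting described in the abstract can be explored numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the simplest case, where every coordinate of every point is an independent draw from N(0, sigma^2). Under that assumption, a standard result is that the difference of two points has independent N(0, 2 sigma^2) coordinates, so the distance equals sigma * sqrt(2) times a chi-distributed variable with n degrees of freedom. The function names `simulate_distances` and `chi_scaled_mean` are hypothetical helpers introduced for this example.

```python
import math
import random


def simulate_distances(n_dims, n_pairs, sigma=1.0, seed=0):
    """Draw n_pairs pairs of points with i.i.d. N(0, sigma^2) coordinates
    (an assumption of this sketch) and return their Euclidean distances."""
    rng = random.Random(seed)
    distances = []
    for _ in range(n_pairs):
        squared = 0.0
        for _ in range(n_dims):
            diff = rng.gauss(0.0, sigma) - rng.gauss(0.0, sigma)
            squared += diff * diff
        distances.append(math.sqrt(squared))
    return distances


def chi_scaled_mean(n_dims, sigma=1.0):
    """Mean of sigma * sqrt(2) * chi_n, i.e.
    2 * sigma * Gamma((n + 1) / 2) / Gamma(n / 2).
    lgamma keeps the ratio stable for large n."""
    log_ratio = math.lgamma((n_dims + 1) / 2) - math.lgamma(n_dims / 2)
    return 2.0 * sigma * math.exp(log_ratio)


if __name__ == "__main__":
    for n in (2, 10, 100):
        d = simulate_distances(n, 5000)
        mean = sum(d) / len(d)
        std = math.sqrt(sum((v - mean) ** 2 for v in d) / len(d))
        # Compare the simulated mean with the chi-based value and show
        # how the spread behaves relative to the mean as n grows.
        print(n, round(mean, 3), round(chi_scaled_mean(n), 3), round(std / mean, 3))
```

Under these assumptions the mean distance grows roughly like sigma * sqrt(2n) while the standard deviation stays close to sigma, so the ratio in the last printed column shrinks as n increases. This is one way to see why raw distances become less discriminating in high dimensions, and why a reference distribution is needed to judge whether an observed distance is unusually small or large.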

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.1508.02238
Document Type :
Working Paper