1. Distribution of Euclidean Distances Between Randomly Distributed Gaussian Points in n-Space
- Author
- Thirey, Benjamin and Hickman, Randal
- Subjects
- Mathematics - Probability; 60D05 (Primary); 52A38, 53C65 (Secondary)
- Abstract
- The curse of dimensionality is a common phenomenon which affects analysis of datasets characterized by large numbers of variables associated with each point. Problematic scenarios of this type frequently arise in classification algorithms which are heavily dependent upon distances between points, such as nearest-neighbor and $k$-means clustering. Given that contributing variables follow Gaussian distributions, this research derives the probability distribution that describes the distances between randomly generated points in $n$-space. The theoretical results are extended to examine additional properties of the distribution as the dimension becomes arbitrarily large. With this distribution of distances between randomly generated points in arbitrarily large dimensions, one can then determine the significance of distance measurements between any collection of individual points.
- Comment
- 13 pages, 4 figures
- Published
- 2015
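
The setting described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes the simplest case, in which every coordinate of both points is drawn independently from N(0, sigma^2), which may be narrower than the setting the authors treat. Under that assumption the difference of two points is N(0, 2 sigma^2 I_n), so the distance divided by sigma*sqrt(2) follows a chi distribution with n degrees of freedom. The function names `expected_distance` and `simulated_distances` are hypothetical and chosen only for this illustration.

```python
import math
import random


def expected_distance(n, sigma=1.0):
    """Mean of ||X - Y|| for independent X, Y ~ N(0, sigma^2 I_n).

    Assumption (not from the source record): i.i.d. coordinates with a
    common variance, so ||X - Y|| / (sigma * sqrt(2)) is chi-distributed
    with n degrees of freedom.
    """
    # Mean of a chi variable with n dof: sqrt(2) * Gamma((n+1)/2) / Gamma(n/2).
    # lgamma keeps the ratio stable when n is large.
    chi_mean = math.sqrt(2.0) * math.exp(
        math.lgamma((n + 1) / 2.0) - math.lgamma(n / 2.0)
    )
    return sigma * math.sqrt(2.0) * chi_mean


def simulated_distances(n, pairs, sigma=1.0, seed=0):
    """Draw `pairs` pairs of Gaussian points in n-space; return their distances."""
    rng = random.Random(seed)
    distances = []
    for _ in range(pairs):
        squared = 0.0
        for _ in range(n):
            diff = rng.gauss(0.0, sigma) - rng.gauss(0.0, sigma)
            squared += diff * diff
        distances.append(math.sqrt(squared))
    return distances


if __name__ == "__main__":
    for n in (2, 10, 100):
        d = simulated_distances(n, pairs=2000)
        mean = sum(d) / len(d)
        std = math.sqrt(sum((x - mean) ** 2 for x in d) / (len(d) - 1))
        # The ratio std / mean shrinks as n grows: distances concentrate.
        print(n, round(mean, 3), round(expected_distance(n), 3), round(std / mean, 3))
```

Running the script prints, for each dimension, the simulated mean distance, the chi-based mean, and the relative spread. The shrinking relative spread is one way to see why raw distances become harder to interpret in high dimensions unless they are compared against a reference distribution.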