1. Partitioned K-nearest neighbor local depth for scalable comparison-based learning
- Author
-
Baron, Jacob D., Darling, R. W. R., Davis, J. Laylon, and Pettit, R.
- Subjects
Computer Science - Data Structures and Algorithms ,Mathematics - Combinatorics ,Mathematics - Probability ,90C35 ,F.2.2 - Abstract
A triplet comparison oracle on a set $S$ takes an object $x \in S$ and for any pair $\{y, z\} \subset S \setminus \{x\}$ declares which of $y$ and $z$ is more similar to $x$. Partitioned Local Depth (PaLD) supplies a principled non-parametric partitioning of $S$ under such triplet comparisons but needs $O(n^2 \log{n})$ oracle calls and $O(n^3)$ post-processing steps. We introduce Partitioned Nearest Neighbors Local Depth (PaNNLD), a computationally tractable variant of PaLD leveraging the $K$-nearest neighbors digraph on $S$. PaNNLD needs only $O(n K \log{n})$ oracle calls, by replacing an oracle call by a coin flip when neither $y$ nor $z$ is adjacent to $x$ in the undirected version of the $K$-nearest neighbors digraph. By averaging over randomizations, PaNNLD subsequently requires (at best) only $O(n K^2)$ post-processing steps. Concentration of measure shows that the probability of randomization-induced error $\delta$ in PaNNLD is no more than $2 e^{-\delta^2 K^2}$., Comment: 27 pages, 2 figures
- Published
- 2021