Back to Search Start Over

Efficient Random Sampling -- Parallel, Vectorized, Cache-Efficient, and Online

Authors :
Sanders, Peter
Lamm, Sebastian
Hübschle-Schneider, Lorenz
Schrade, Emanuel
Dachsbacher, Carsten
Source :
ACM Transactions on Mathematical Software (TOMS), Volume 44, Issue 3 (April 2018), pages 29:1-29:14
Publication Year :
2016

Abstract

We consider the problem of sampling $n$ numbers from the range $\{1,\ldots,N\}$ without replacement on modern architectures. The main result is a simple divide-and-conquer scheme that makes sequential algorithms more cache efficient and leads to a parallel algorithm running in expected time $\mathcal{O}(n/p+\log p)$ on $p$ processors, i.e., scales to massively parallel machines even for moderate values of $n$. The amount of communication between the processors is very small (at most $\mathcal{O}(\log p)$) and independent of the sample size. We also discuss modifications needed for load balancing, online sampling, sampling with replacement, Bernoulli sampling, and vectorization on SIMD units or GPUs.

Details

Database :
arXiv
Journal :
ACM Transactions on Mathematical Software (TOMS), Volume 44, Issue 3 (April 2018), pages 29:1-29:14
Publication Type :
Report
Accession number :
edsarx.1610.05141
Document Type :
Working Paper
Full Text :
https://doi.org/10.1145/3157734