
Algorithmic and language-based optimization of Marsa-LFIB4 pseudorandom number generator using OpenMP, OpenACC and CUDA.

Authors :
Stpiczyński, Przemysław
Source :
Journal of Parallel & Distributed Computing. Mar 2020, Vol. 137, p238-245. 8p.
Publication Year :
2020

Abstract

The aim of this paper is to present new high-performance implementations of Marsa-LFIB4, a high-quality multiple recursive pseudorandom number generator. We propose an algorithmic approach that combines language-based vectorization techniques with a new divide-and-conquer parallel method that exploits the special sparse structure of the matrix obtained from the recursive formula that defines the generator. Our portable OpenACC implementation achieves performance comparable to that of our CUDA-based and OpenMP-based implementations on GPUs and multicore CPUs, respectively.

• New algorithmic approach for vectorization and parallelization of recursive PRNGs.
• Methodology and guidelines for optimizing recursive computations on GPUs.
• Case study on how to use shared memory to reduce references to global memory.
• Comparison of language-based optimization techniques available in CUDA and OpenACC.
• Our CUDA and OpenACC implementations achieve excellent performance on modern GPUs.

[ABSTRACT FROM AUTHOR]
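For orientation, the recursive formula the abstract refers to can be sketched as a plain sequential loop. This is only a scalar reference version, not the paper's vectorized or divide-and-conquer method. The lags (55, 119, 179, 256) and the modulus 2^32 are taken from Marsaglia's published definition of LFIB4 rather than from this record, and the function name `lfib4_fill` and its interface are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Lags of the LFIB4 recurrence (assumed from Marsaglia's definition):
 * x_k = (x_{k-55} + x_{k-119} + x_{k-179} + x_{k-256}) mod 2^32 */
enum { LAG_A = 55, LAG_B = 119, LAG_C = 179, LAG_D = 256 };

/* Hypothetical helper: given seed[0..255] = x_0..x_255, writes
 * out[i] = x_{256+i} for i = 0..n-1 using a 256-entry ring buffer. */
void lfib4_fill(const uint32_t seed[LAG_D], uint32_t *out, size_t n)
{
    uint32_t ring[LAG_D];

    for (size_t i = 0; i < LAG_D; ++i)
        ring[i] = seed[i];

    for (size_t i = 0; i < n; ++i) {
        /* x_j lives in slot j % 256, so slot pos holds x_{k-256}
         * for the element k = 256 + i being generated. */
        size_t pos = i % LAG_D;
        uint32_t next = (uint32_t)(ring[pos]
                      + ring[(pos + LAG_D - LAG_C) % LAG_D]   /* x_{k-179} */
                      + ring[(pos + LAG_D - LAG_B) % LAG_D]   /* x_{k-119} */
                      + ring[(pos + LAG_D - LAG_A) % LAG_D]); /* x_{k-55}  */

        ring[pos] = next;   /* x_k overwrites the oldest entry */
        out[i] = next;      /* unsigned addition wraps modulo 2^32 */
    }
}
```

Each new value depends on values produced as recently as 55 steps earlier, so the loop cannot be parallelized naively over `i`. This dependency is the reason a recursive generator of this kind needs a dedicated parallel method such as the one the abstract proposes.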

Details

Language :
English
ISSN :
0743-7315
Volume :
137
Database :
Academic Search Index
Journal :
Journal of Parallel & Distributed Computing
Publication Type :
Academic Journal
Accession number :
141197294
Full Text :
https://doi.org/10.1016/j.jpdc.2019.12.004