Two-Stage Column Block Parallel LU Factorization Algorithm

Authors :
Rongteng Wu
Xiaohong Xie
Source :
IEEE Access, Vol 8, Pp 2645-2655 (2020)
Publication Year :
2020
Publisher :
IEEE, 2020.

Abstract

Parallel computing is increasingly important in computer architecture, and parallel architectures have become ubiquitous in everyday life. Novel architectures and programming models pose new challenges for algorithm design and system software development. This paper presents a two-stage column block parallel LU factorization algorithm for multiple-processor architectures. A given matrix is first partitioned into large blocks, and every large block is then partitioned into a number of small blocks according to the number of processors. Finally, the small column blocks are allocated to processors in an orderly “serpentine arrangement.” Each iteration of the column block parallel LU factorization is separated into two stages of operation. In the first stage, the first-step factorization operation is processed in advance, and nonblocking communication is used to reduce processor idle and waiting time and to improve parallelism. In the second stage, the large blocks are used to feed more powerful processors, such as GPUs, which require more data to exploit their computing capabilities. Experiments are conducted on a multicore system and a multi-GPU system with different configurations to test the algorithm's performance. Compared with other related column block parallel LU factorizations, the two-stage algorithm exhibits better load balancing and better parallel execution time.
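
The two-level partitioning and serpentine allocation described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: it assumes that "serpentine" means a back-and-forth (boustrophedon) order, so that successive sweeps over the processors run 0, 1, ..., P-1 and then P-1, ..., 1, 0, and that each large block is split into P near-equal small column blocks. The names `serpentine_owner`, `two_level_layout`, `large_width`, and `num_procs` are hypothetical.

```python
from typing import List, Tuple


def serpentine_owner(block_index: int, num_procs: int) -> int:
    """Processor owning small block `block_index` under an assumed
    back-and-forth order: even sweeps ascend, odd sweeps descend."""
    sweep, pos = divmod(block_index, num_procs)
    return pos if sweep % 2 == 0 else num_procs - 1 - pos


def two_level_layout(n_cols: int, large_width: int,
                     num_procs: int) -> List[Tuple[int, int, int]]:
    """Split columns into large blocks, split each large block into up to
    `num_procs` small blocks, and return (start, stop, owner) triples."""
    layout = []
    block_index = 0  # running index over all small blocks
    for large_start in range(0, n_cols, large_width):
        large_stop = min(large_start + large_width, n_cols)
        # Spread the columns of this large block as evenly as possible.
        base, extra = divmod(large_stop - large_start, num_procs)
        start = large_start
        for k in range(num_procs):
            size = base + (1 if k < extra else 0)
            if size == 0:
                continue  # fewer columns than processors in this block
            owner = serpentine_owner(block_index, num_procs)
            layout.append((start, start + size, owner))
            start += size
            block_index += 1
    return layout


if __name__ == "__main__":
    # 16 columns, large blocks of 8 columns, 4 processors (illustrative values).
    for start, stop, owner in two_level_layout(16, 8, 4):
        print(f"columns [{start}, {stop}) -> processor {owner}")
```

Under these assumptions, the first large block is assigned to processors 0 through 3 in ascending order and the second in descending order, so no single processor always receives the rightmost columns of a sweep. Whether the paper's arrangement matches this exactly should be checked against the full text.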

Details

Language :
English
ISSN :
2169-3536
Volume :
8
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.f89c8790e54a0abc4e070575167a93
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2019.2962355