Square Matrix Multiplication Using CUDA on GP-GPU.

Authors :
Jimale, Ali Olow
Ridzuan, Fakhitah
Wan Zainon, Wan Mohd Nazmee
Source :
Procedia Computer Science; 2019, Vol. 161, p398-405, 8p
Publication Year :
2019

Abstract

This paper focuses on the matrix multiplication algorithm, particularly parallel square matrix multiplication using the Compute Unified Device Architecture (CUDA) programming model with the C programming language. Matrix multiplication is among the time-consuming problems that require huge computational resources to improve speedup. As many studies have shown, it is not easy to achieve high speedup with a sequential matrix multiplication algorithm on large inputs. The emphasis of this study is to propose a parallel algorithm that calculates the product of two square matrices with improved speedup compared to the sequential and OpenMP algorithms. In this research, biruni (a super machine workstation) in the School of Computer Sciences, USM, Malaysia, equipped with a General Purpose Graphics Processing Unit (GP-GPU), was used to parallelize the matrix product algorithm. The proposed CUDA-based algorithm was compared with the sequential algorithm and parallel OpenMP versions to evaluate its speedup. The overall results show that CUDA-based parallel matrix multiplication is approximately 400 times faster than sequential matrix multiplication and 4 times faster than OpenMP matrix multiplication. Therefore, the proposed parallel algorithm can help researchers working on problems that involve matrix multiplication. It can also help mathematicians calculate the product of any two matrices and obtain the result in a shorter time. [ABSTRACT FROM AUTHOR]
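
For orientation, the kind of computation the abstract compares can be sketched in plain C. This is a minimal illustration and not the authors' code: the function names matmul_element and matmul_square, the row-major layout, and the use of double are assumptions made for the example. The inner function computes one output element, which is the unit of work a CUDA kernel would typically assign to a single thread; the outer loops form the sequential baseline, and the optional pragma shows how an OpenMP variant might parallelize the row loop when compiled with OpenMP support.

```c
#include <stddef.h>

/* Hypothetical helper: dot product of row `row` of A with column `col` of B.
   Both matrices are n x n and stored row-major. In a CUDA version, one
   thread would typically compute this value for its own (row, col) pair. */
static double matmul_element(const double *a, const double *b,
                             int n, int row, int col)
{
    double sum = 0.0;
    for (int k = 0; k < n; ++k)
        sum += a[(size_t)row * n + k] * b[(size_t)k * n + col];
    return sum;
}

/* Hypothetical baseline: C = A * B for n x n matrices, O(n^3) work.
   Without OpenMP the pragma is skipped and the loops run sequentially. */
void matmul_square(const double *a, const double *b, double *c, int n)
{
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (int row = 0; row < n; ++row)
        for (int col = 0; col < n; ++col)
            c[(size_t)row * n + col] = matmul_element(a, b, n, row, col);
}
```

Each output element depends only on one row of A and one column of B, so the n * n elements can be computed independently. That independence is what makes the problem a natural fit for both OpenMP threads and GPU threads.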

Details

Language :
English
ISSN :
18770509
Volume :
161
Database :
Supplemental Index
Journal :
Procedia Computer Science
Publication Type :
Academic Journal
Accession number :
141238474
Full Text :
https://doi.org/10.1016/j.procs.2019.11.138