Optimizing matrix-matrix multiplication on Intel’s advanced vector extensions multicore processor

Authors :: Tomonobu Senjyu
Ashraf Mohamed Hemeida
Mahmoud A. Saber
Mountasser M.M. Mahmoud
Salem Alkhalaf
Ayman M. Bahaa Eldin
Somaia Awad Hassan
Abdullah H. Alayed
Source :: Ain Shams Engineering Journal. 11:1179-1190
Publication Year :: 2020
Publisher :: Elsevier BV, 2020.
Abstract: This paper focuses on Intel Advanced Vector Extensions (AVX), which emerged from recent developments in Intel and AMD processors. This instruction set processes chunks of data both individually and together, and it supports a variety of applications such as image processing. Our goal is to accelerate and optimize square single-precision matrix multiplication for large sizes ranging from 2080 to 4512. Our optimization uses AVX instruction sets, OpenMP parallelization, and memory access optimization to overcome bandwidth limitations. This paper differs from other papers by concentrating on several main techniques and their results. Guidelines for the parallel implementation of these algorithms are presented, taking into account the characteristics of the target architecture. This work includes a comparative study of two popular compilers: Intel C++ compiler 17.0 versus Microsoft Visual Studio C++ compiler 2015. Additionally, single-core and multicore platforms are compared. The proposed optimized algorithms achieve performance improvements of 71%, 59%, and 56% for C = A.B, C = A.B^T, and C = A^T.B, respectively, compared with the results achieved by the latest Intel Math Kernel Library 2017 SGEMV subroutines.
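
As a rough illustration of how the three techniques named in the abstract (AVX instructions, OpenMP parallelization, and cache-friendly memory access) can fit together, the following is a minimal sketch for C = A.B. It is not the authors' implementation. The function name `sgemm_avx_sketch`, the row-major layout, and the requirement that the matrix size be a multiple of 8 are assumptions made for this example. When the compiler is not targeting AVX, a scalar fallback with the same loop structure is used.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __AVX__
#include <immintrin.h>
#endif

/* Hypothetical helper (not from the paper): C = A * B for square n x n
 * single-precision matrices stored in row-major order.
 * Assumption: n is a multiple of 8, one AVX register holds 8 floats. */
void sgemm_avx_sketch(size_t n, const float *A, const float *B, float *C)
{
    assert(n % 8 == 0);

    /* Rows of C are independent, so they can be divided among threads.
     * Without OpenMP enabled the pragma is ignored and the loop is serial. */
    #pragma omp parallel for
    for (long i = 0; i < (long)n; ++i) {
        size_t row = (size_t)i * n;
        for (size_t j = 0; j < n; j += 8) {
#ifdef __AVX__
            __m256 acc = _mm256_setzero_ps();
            for (size_t k = 0; k < n; ++k) {
                /* Broadcast A[i][k] and multiply by 8 contiguous
                 * elements B[k][j..j+7], so B is read along its rows. */
                __m256 a = _mm256_set1_ps(A[row + k]);
                __m256 b = _mm256_loadu_ps(&B[k * n + j]);
                acc = _mm256_add_ps(acc, _mm256_mul_ps(a, b));
            }
            _mm256_storeu_ps(&C[row + j], acc);
#else
            /* Scalar fallback with the same access pattern. */
            float acc[8] = {0.0f};
            for (size_t k = 0; k < n; ++k)
                for (size_t l = 0; l < 8; ++l)
                    acc[l] += A[row + k] * B[k * n + j + l];
            for (size_t l = 0; l < 8; ++l)
                C[row + j + l] = acc[l];
#endif
        }
    }
}
```

With GCC or Clang, the AVX path and the threading are typically enabled with `-mavx -fopenmp`. Both sizes quoted in the abstract, 2080 and 4512, are multiples of 8, so the size assumption holds for them. A tuned kernel would also add cache blocking and register tiling, which this sketch omits.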

Subjects :: Multi-core processor
Computer science
020209 energy
Subroutine
020208 electrical & electronic engineering
General Engineering
02 engineering and technology
Parallel computing
computer.software_genre
Microsoft Visual Studio
Matrix multiplication
Instruction set
Kernel (linear algebra)
0202 electrical engineering, electronic engineering, information engineering
Compiler
Performance improvement
computer

Details

ISSN :: 20904479
Volume :: 11
Database :: OpenAIRE
Journal :: Ain Shams Engineering Journal
Accession number :: edsair.doi...........d3f8e7012afb489e498d9660e43bede0
