Back to Search Start Over

Computationally Efficient Parallel Matrix-Matrix Multiplication on the Torus.

Authors :
Hutchison, David
Kanade, Takeo
Kittler, Josef
Kleinberg, Jon M.
Mattern, Friedemann
Mitchell, John C.
Naor, Moni
Nierstrasz, Oscar
Pandu Rangan, C.
Steffen, Bernhard
Sudan, Madhu
Terzopoulos, Demetri
Tygar, Doug
Vardi, Moshe Y.
Weikum, Gerhard
Labarta, Jesús
Joe, Kazuki
Sato, Toshinori
Zekri, Ahmed S.
Sedukhin, Stanislav G.
Source :
High-Performance Computing; 2008, p219-226, 8p
Publication Year :
2008

Abstract

In this paper, we represent the computation space of the (n×n)-matrix multiplication problem C=C+A·B as a 3D torus. All possible time-minimal scheduling vectors needed to activate the computations inside the corresponding 3D index points at each step of computing are determined. Using the projection method to allocate the scheduled computations to the processing elements, the resulting array processor that minimizes the computing time is a 2D torus with n×n processing elements. For each optimal time scheduling function, three optimal array allocations are obtained from projection. All the resulting allocations of all the optimal scheduling vectors can be classified into three groups. In one group, matrix C remains and both matrices A and B are shifted between neighbor processors. The well-known Cannon's algorithm belongs to this group. In another group, matrix A remains and both matrices B and C are shifted. In the third group, matrix B remains while both matrices A and C are shifted. The obtained array processor allocations need n compute-shift steps to multiply n×n dense matrices. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISBNs :
9783540777038
Database :
Complementary Index
Journal :
High-Performance Computing
Publication Type :
Book
Accession number :
34228373
Full Text :
https://doi.org/10.1007/978-3-540-77704-5_19