Optimizing Matrix Transpose on Torus Interconnects.
- Source :
- Euro-par 2010 - Parallel Processing (9783642152900); 2010, p440-451, 12p
- Publication Year :
- 2010
Abstract
- Matrix transpose is a fundamental matrix operation that arises in many scientific and engineering applications. Communication is the main bottleneck in performing matrix transpose on most multi-processor systems. In this paper, we focus on torus interconnection networks and propose application-level routing techniques that improve load balancing, resulting in better performance. Our basic idea is to route the data via carefully selected intermediate nodes. However, directly employing this technique may worsen congestion. We overcome this issue by employing the routing only for a selected set of communicating pairs. We implement our optimizations on the Blue Gene/P supercomputer and demonstrate up to 35% improvement in performance. [ABSTRACT FROM AUTHOR]
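The idea of routing through an intermediate node can be illustrated with a small model. The following is a minimal sketch only, not the paper's algorithm: it assumes a 2D n x n torus, dimension-order routing, and a transpose pattern in which node (i, j) sends to node (j, i). All function names (`dor_path`, `via_path`, `max_link_load`, `transpose_pairs`) are hypothetical, and the abstract does not say how intermediate nodes or communicating pairs are selected, so that choice is left to the caller.

```python
from collections import Counter

def ring_step(a, b, n):
    """Shorter direction (+1 or -1) from a to b on a ring of size n; ties go +1."""
    forward = (b - a) % n
    return 1 if forward <= n - forward else -1

def dor_path(src, dst, n):
    """Dimension-order route on an n x n torus (assumed routing model).

    Corrects the first coordinate, then the second, and returns the list
    of directed links (from_node, to_node) that the message crosses.
    """
    links = []
    x, y = src
    if x != dst[0]:
        step = ring_step(x, dst[0], n)
        while x != dst[0]:
            nx = (x + step) % n
            links.append(((x, y), (nx, y)))
            x = nx
    if y != dst[1]:
        step = ring_step(y, dst[1], n)
        while y != dst[1]:
            ny = (y + step) % n
            links.append(((x, y), (x, ny)))
            y = ny
    return links

def via_path(src, mid, dst, n):
    """Two-phase route: src to an intermediate node, then on to dst."""
    return dor_path(src, mid, n) + dor_path(mid, dst, n)

def max_link_load(paths):
    """Number of messages crossing the most heavily used directed link."""
    load = Counter(link for path in paths for link in path)
    return max(load.values()) if load else 0

def transpose_pairs(n):
    """Transpose pattern: node (i, j) sends to (j, i); diagonal nodes stay put."""
    return [((i, j), (j, i)) for i in range(n) for j in range(n) if i != j]

if __name__ == "__main__":
    n = 4
    direct = [dor_path(s, d, n) for s, d in transpose_pairs(n)]
    print("max link load, direct routing:", max_link_load(direct))
```

To experiment with selective rerouting, one could replace `dor_path(s, d, n)` with `via_path(s, mid, d, n)` for a chosen subset of pairs and compare the resulting `max_link_load` values; which pairs and intermediate nodes help is exactly the question the paper addresses.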
Details
- Language :
- English
- ISBNs :
- 9783642152900
- Database :
- Complementary Index
- Journal :
- Euro-par 2010 - Parallel Processing (9783642152900)
- Publication Type :
- Book
- Accession number :
- 76758407
- Full Text :
- https://doi.org/10.1007/978-3-642-15291-7_41