Efficiently running SpMV on long vector architectures

Authors :
Marc Casas
Constantino Gómez
Erich Focht
Filippo Mantovani
Universitat Politècnica de Catalunya. Doctorat en Arquitectura de Computadors
Barcelona Supercomputing Center
Source :
PPoPP, UPCommons. Portal del coneixement obert de la UPC, Universitat Politècnica de Catalunya (UPC)
Publication Year :
2021
Publisher :
Association for Computing Machinery (ACM), 2021.

Abstract

Sparse matrix-vector multiplication (SpMV) is an essential kernel for parallel numerical applications. SpMV exhibits sparse and irregular data accesses, which complicate its vectorization. As a result, SpMV frequently delivers suboptimal performance on long vector ISAs that exploit SIMD parallelism. In this context, new optimizations are fundamental to enabling high-performance SpMV on emerging long vector architectures. In this paper, we improve the state-of-the-art SELL-C-σ sparse matrix format by proposing several new optimizations for SpMV. We target aggressive long vector architectures like the NEC Vector Engine. By combining several optimizations, we obtain an average 12% improvement over SELL-C-σ on a heterogeneous set of 24 matrices. Our optimizations boost performance on long vector architectures because they expose a high degree of SIMD parallelism.

The authors would like to acknowledge the support of NEC Corporation. This work is partially supported by the Spanish Ministry of Science and Technology through the PID2019-107255GB project and by the Generalitat de Catalunya (contract 2017-SGR-1414). Marc Casas has been partially supported by the Spanish Ministry of Economy, Industry and Competitiveness under Ramón y Cajal fellowship number RYC-2017-23269.

Details

Language :
English
Database :
OpenAIRE
Journal :
PPoPP, UPCommons. Portal del coneixement obert de la UPC, Universitat Politècnica de Catalunya (UPC)
Accession number :
edsair.doi.dedup.....b98e14d595764c22a08eb6bce1a3ebef