Parallel GEMM-based convolution for deep learning on multicore RISC-V processors.

Authors :
Ramírez, Cristian
Castelló, Adrián
Martínez, Héctor
Quintana-Ortí, Enrique S.
Source :
Journal of Supercomputing. Jun 2024, Vol. 80, Issue 9, pp. 12623-12643. 21 pp.
Publication Year :
2024

Abstract

We address the efficient implementation of the convolution operator on the GAP8 parallel ultra-low power platform (PULP), a heterogeneous multi-core processor equipped with a fabric controller (FC); a cluster of eight compute cores; and a four-level memory hierarchy with scratchpads instead of conventional, hardware-assisted cache memories. Our solution for this platform transforms the convolution into a general matrix–matrix multiplication (gemm) via the lowering approach, demonstrating that it is possible to attain reasonable performance on the GAP8 by carefully adapting techniques such as tiling and loop parallelism, which are mainstream in the multi-threaded, cache-aware realization of gemm. [ABSTRACT FROM AUTHOR]
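The lowering approach named in the abstract can be illustrated with a minimal sketch. It is not the authors' GAP8 code: the function names (`im2col`, `gemm_tiled`, `conv_via_gemm`, `conv_direct`), the nested-list data layout, the stride of 1 with no padding, and the tile size are all assumptions chosen for illustration. The sketch copies the input patches into a matrix, multiplies it by the flattened filters with a tiled loop nest, and checks the result against a direct convolution. It runs sequentially; on a multicore cluster one of the tile loops would typically be the one distributed across cores.

```python
def im2col(x, kh, kw):
    """Lower a C x H x W input (nested lists) to a (C*kh*kw) x (Ho*Wo) matrix.
    Assumes stride 1 and no padding."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    ho, wo = h - kh + 1, w - kw + 1
    rows = []
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                # One row per (channel, filter-row, filter-col) offset,
                # one column per output pixel.
                rows.append([x[ci][oi + i][oj + j]
                             for oi in range(ho) for oj in range(wo)])
    return rows, ho, wo


def gemm_tiled(a, b, tile=2):
    """C = A * B with the three loops blocked into tile x tile x tile chunks."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):          # candidate loop to split across cores
        for p0 in range(0, k, tile):
            for j0 in range(0, n, tile):
                for i in range(i0, min(i0 + tile, m)):
                    for p in range(p0, min(p0 + tile, k)):
                        aip = a[i][p]
                        for j in range(j0, min(j0 + tile, n)):
                            c[i][j] += aip * b[p][j]
    return c


def conv_via_gemm(x, filters, tile=2):
    """Convolution of x (C x H x W) with filters (K x C x kh x kw) via lowering."""
    kh, kw = len(filters[0][0]), len(filters[0][0][0])
    b, ho, wo = im2col(x, kh, kw)
    # Flatten each filter in the same (channel, row, col) order used by im2col.
    a = [[f[ci][i][j] for ci in range(len(f))
          for i in range(kh) for j in range(kw)] for f in filters]
    c = gemm_tiled(a, b, tile)
    # Reshape each GEMM output row back into an Ho x Wo feature map.
    return [[row[oi * wo:(oi + 1) * wo] for oi in range(ho)] for row in c]


def conv_direct(x, filters):
    """Reference: direct convolution with the same conventions."""
    kh, kw = len(filters[0][0]), len(filters[0][0][0])
    ho, wo = len(x[0]) - kh + 1, len(x[0][0]) - kw + 1
    return [[[sum(f[c][i][j] * x[c][oi + i][oj + j]
                  for c in range(len(x))
                  for i in range(kh) for j in range(kw))
              for oj in range(wo)] for oi in range(ho)] for f in filters]
```

With a single 3 x 3 input channel holding the values 1 to 9 and one 2 x 2 identity-like filter `[[1, 0], [0, 1]]`, both routines sum each pixel with its lower-right neighbour. The lowered matrix is larger than the input because overlapping patches are duplicated, which is the memory cost that tiling has to manage on a platform with small scratchpads.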

Details

Language :
English
ISSN :
0920-8542
Volume :
80
Issue :
9
Database :
Academic Search Index
Journal :
Journal of Supercomputing
Publication Type :
Academic Journal
Accession number :
177648344
Full Text :
https://doi.org/10.1007/s11227-024-05927-y