Performance Analysis of Matrix Multiplication for Deep Learning on the Edge

Authors :
Ramírez, Cristian
Castelló, Adrián
Martínez, Héctor
Quintana-Ortí, Enrique S.
Source :
High Performance Computing. ISC High Performance 2022 International Workshops. ISC High Performance 2022. Lecture Notes in Computer Science, vol 13387. Springer, Cham
Publication Year :
2024

Abstract

The devices designed for the Internet-of-Things encompass a large variety of distinct processor architectures, forming a highly heterogeneous zoo. To tackle this heterogeneity, we employ a simulator to estimate the performance of the matrix-matrix multiplication (GEMM) kernel on processors designed to operate at the edge. Our simulator adheres to the modern implementations of GEMM advocated by GotoBLAS2, BLIS, OpenBLAS, etc., and carefully accounts for the amount of data transferred across the memory hierarchy by different algorithmic variants of the kernel. A small collection of experiments provides the data necessary to calibrate the simulator and deliver highly accurate estimates of the execution time for a given processor architecture.

Comment: 12 pages, 2 tables, 6 figures
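As a rough illustration of the kind of data-transfer accounting the abstract describes, the sketch below walks the three outer loops of a GotoBLAS-style blocked GEMM and counts how many matrix elements are copied into the packed buffers for A and B, and how many elements of C are updated. It is not the authors' simulator: the function name `gemm_traffic` and the returned keys are hypothetical, and the cache-blocking parameters `mc`, `nc`, `kc` are assumed inputs, named after the convention used in BLIS.

```python
def gemm_traffic(m, n, k, mc, nc, kc):
    """Count elements moved by the outer loops of a GotoBLAS-style blocked GEMM.

    Hypothetical helper for illustration only; it counts elements, not time.
    C (m x n) += A (m x k) * B (k x n), blocked with sizes mc, nc, kc.
    """
    packed_a = 0   # elements of A copied into the packed A buffer
    packed_b = 0   # elements of B copied into the packed B buffer
    c_updates = 0  # elements of C touched by the macro-kernel calls

    for jc in range(0, n, nc):          # column panels of B and C
        nb = min(nc, n - jc)
        for pc in range(0, k, kc):      # slices along the shared k dimension
            kb = min(kc, k - pc)
            packed_b += kb * nb         # pack a kb x nb panel of B
            for ic in range(0, m, mc):  # row blocks of A and C
                mb = min(mc, m - ic)
                packed_a += mb * kb     # pack an mb x kb block of A
                c_updates += mb * nb    # macro-kernel updates an mb x nb block of C

    return {"packed_a": packed_a, "packed_b": packed_b, "c_updates": c_updates}


if __name__ == "__main__":
    # Example problem and block sizes are arbitrary, chosen only to show the call.
    print(gemm_traffic(m=1000, n=1000, k=1000, mc=128, nc=512, kc=256))
```

With this loop order, B is packed exactly once overall, A is repacked once per column panel of C, and C is revisited once per slice of the k dimension, so the block sizes directly control the traffic. Turning such counts into execution-time estimates would additionally require per-level transfer rates, which the abstract says are obtained by calibration experiments.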

Details

Database :
arXiv
Journal :
High Performance Computing. ISC High Performance 2022 International Workshops. ISC High Performance 2022. Lecture Notes in Computer Science, vol 13387. Springer, Cham
Publication Type :
Report
Accession number :
edsarx.2403.07731
Document Type :
Working Paper
Full Text :
https://doi.org/10.1007/978-3-031-23220-6_5