Acceleration and optimization of group convolution computation based on the Sunway many-core architecture.

Authors :
王鑫
张铭
Source :
Application Research of Computers / Jisuanji Yingyong Yanjiu. Jun 2023, Vol. 40 Issue 6, p1745-1749. 5p.
Publication Year :
2023

Abstract

To address the problems of high computational complexity, high computational cost, and a large number of parameters, this paper proposes a parallel group convolution algorithm for the domestic SW26010P many-core processor. The core idea is to use a dedicated data layout and to map the computation across the processor's cores so that it runs in parallel. Experimental results show that, compared with a single-core serial algorithm, the proposed parallel group convolution algorithm achieves a speed-up ratio of up to 79.5 and a maximum effective computing power of 186.7 MFLOPS. After further data-parallel optimization with SIMD instructions, the algorithm achieves a speed-up ratio of up to 10.2 over the parallel group convolution algorithm before optimization. [ABSTRACT FROM AUTHOR]
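
For readers unfamiliar with the operation named in the abstract, the following is a minimal reference sketch of group convolution in plain Python. It is not the authors' SW26010P implementation: the function name `group_conv2d`, the nested-list data layout, and the stride-1, no-padding setting are all assumptions made for illustration. It shows why the operation suits parallel mapping: each group reads only its own slice of the input channels, so the outer loop over groups has no cross-iteration dependencies.

```python
def group_conv2d(x, w, groups):
    """Naive grouped 2D convolution (stride 1, no padding).

    Hypothetical reference layout, chosen for illustration only:
      x: input,   nested lists of shape [C_in][H][W]
      w: weights, nested lists of shape [C_out][C_in // groups][K][K]
    Returns nested lists of shape [C_out][H - K + 1][W - K + 1].
    """
    c_in, height, width = len(x), len(x[0]), len(x[0][0])
    c_out, k = len(w), len(w[0][0])
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")

    cin_g = c_in // groups    # input channels seen by each group
    cout_g = c_out // groups  # output channels produced by each group
    out_h, out_w = height - k + 1, width - k + 1
    out = [[[0.0] * out_w for _ in range(out_h)] for _ in range(c_out)]

    # Each iteration of the group loop is independent of the others,
    # which is the property a parallel mapping can exploit.
    for g in range(groups):
        for oc in range(g * cout_g, (g + 1) * cout_g):
            for ic in range(cin_g):
                src = x[g * cin_g + ic]  # channel from this group's slice
                ker = w[oc][ic]
                for i in range(out_h):
                    for j in range(out_w):
                        acc = 0.0
                        for di in range(k):
                            for dj in range(k):
                                acc += src[i + di][j + dj] * ker[di][dj]
                        out[oc][i][j] += acc
    return out
```

With this layout the weight tensor holds C_out × (C_in / groups) × K × K values, so using `groups` groups divides the parameter count of a standard convolution by `groups`.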

Details

Language :
Chinese
ISSN :
10013695
Volume :
40
Issue :
6
Database :
Academic Search Index
Journal :
Application Research of Computers / Jisuanji Yingyong Yanjiu
Publication Type :
Academic Journal
Accession number :
169823958
Full Text :
https://doi.org/10.19734/j.issn.1001-3695.2022.10.0559