
SemiMap: A Semi-Folded Convolution Mapping for Speed-Overhead Balance on Crossbars

Authors :
Jing Pei
Xing Hu
Xin Ma
Guanrui Wang
Lei Deng
Ling Liang
Liang Chang
Yuan Xie
Guoqi Li
Liu Liu
Source :
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. 39:117-130
Publication Year :
2020
Publisher :
Institute of Electrical and Electronics Engineers (IEEE), 2020.

Abstract

Crossbar architectures are widely used in neural network (NN) accelerators built on both conventional and emerging devices. They perform well on fully connected layers through efficient vector–matrix multiplication. However, this advantage degrades on convolutional layers, which involve heavy data reuse, because execution speed and resource overhead are imbalanced under existing fully unfolded or fully folded mapping strategies. To address this issue, we propose a novel semi-folded mapping (SemiMap) framework for implementing convolution on crossbars. It simultaneously folds the physical resources along the row dimension of feature maps (FMs) and unfolds them along the column dimension. The former reduces the resource overhead, and the latter maintains the parallelism. An FM slicing scheme is further proposed to enable the processing of large images. With our mapping framework, a row-by-row streaming pipeline for intra-image dataflow and a periodic pipeline for inter-image dataflow are easily obtained. To validate the idea, we build a many-crossbar architecture with several designs that guarantee overall functionality and performance. Based on measurement data from a fabricated chip, a mapping compiler and a cycle-accurate simulator are developed for the hardware simulation of large-scale networks. We evaluate the proposed SemiMap on various convolutional NNs across different network scales. More than 35× resource saving and a cycle reduction of several hundred times are demonstrated compared to the existing fully unfolded and fully folded strategies, respectively. This paper moves beyond the current extreme mapping schemes and provides a balanced solution for efficiently deploying computational graphs with data reuse on many-crossbar architectures.
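
The folding idea in the abstract can be illustrated with a small numerical sketch. This is only a toy model written for this record, not the paper's mapping, compiler, or hardware: the function names (`column_unfolded_weights`, `semi_folded_conv`, `direct_conv`) are hypothetical, the kernel is assumed square with stride 1 and no padding, and a plain Python matrix stands in for a crossbar. The weights are unfolded along the column dimension so that one vector-matrix multiplication yields a whole output row, while the same matrix is reused (folded) across rows as the input streams in row by row.

```python
def column_unfolded_weights(kernel, width):
    """Unfold a k x k kernel along the column dimension (assumed layout).

    Returns a (k * width) x (width - k + 1) matrix; column c holds the
    weights that produce output pixel c of one output row.
    """
    k = len(kernel)
    out_w = width - k + 1
    weights = [[0.0] * out_w for _ in range(k * width)]
    for c in range(out_w):
        for i in range(k):
            for j in range(k):
                weights[i * width + c + j][c] = kernel[i][j]
    return weights


def vector_matrix(vec, mat):
    """Stand-in for a crossbar: one vector-matrix multiplication."""
    return [sum(v * row[c] for v, row in zip(vec, mat))
            for c in range(len(mat[0]))]


def semi_folded_conv(image, kernel):
    """Reuse one weight matrix across rows; compute each row in one step."""
    k = len(kernel)
    weights = column_unfolded_weights(kernel, len(image[0]))
    out = []
    for r in range(len(image) - k + 1):
        # Buffer of k consecutive input rows, flattened into one vector.
        window = [x for row in image[r:r + k] for x in row]
        out.append(vector_matrix(window, weights))
    return out


def direct_conv(image, kernel):
    """Reference sliding-window convolution (cross-correlation form)."""
    k = len(kernel)
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(len(image[0]) - k + 1)]
            for r in range(len(image) - k + 1)]


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0], [0, -1]]
result = semi_folded_conv(image, kernel)
```

In this sketch the weight matrix grows with the image width but not with its height, which is the intuition behind trading some resource overhead for row-level parallelism; the actual resource and cycle figures reported in the abstract come from the authors' hardware and simulator, not from a model like this.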

Details

ISSN :
1937-4151 and 0278-0070
Volume :
39
Database :
OpenAIRE
Journal :
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Accession number :
edsair.doi...........a7e792256d09681806acf124ec01d978