Back to Search Start Over

A None-Sparse Inference Accelerator that Distills and Reuses the Computation Redundancy in CNNs.

Authors :
Ying Wang
Shengwen Liang
Huawei Li
Xiaowei Li
Source :
DAC: Annual ACM/IEEE Design Automation Conference; 2019, Issue 56, p73-78, 6p
Publication Year :
2019

Abstract

Prior research on energy-efficient Convolutional Neural Network (CNN) inference accelerators mostly focus on exploiting the model sparsity, i.e., zero patterns in weight and activations, to reduce the on-chip storage and computation overhead. In this work, we found in addition to zero patterns, a larger group of repetitive patterns and values exists in the working-set of CNN inference task, which is defined as computation redundancy and induces unnecessary performance and storage overhead in CNN accelerators. Based on this observation, we proposed a redundancy-free architecture that detects and eliminates the repetitive computation and storage patterns in CNN for more efficient network inference. The architecture consists of two parts: the off-line parameter analyzer that extracts the repetitive patterns in the 3D tensor of parameters, and the dataflow accelerator. The proposed accelerator at first preprocesses the weight patterns and the dynamically generated activations, and then cache these intermediate results in special P²-cache banks for further usage in convolution or full-connection stage. It is evaluated in experiments that the proposed Cavoluche architecture removes up to 89% of the repetitive operations from the layer inference process and reduce 77% of on-chip storage space to store both redundancy-free weight and activations. It is seen in experiments that the implementation of Cavoluche outperforms the state-of-the-art mobile GPGPU in both performance and energy-efficiency. When compared to the latest sparsity base accelerators, Cavoluche also achieves better operation elimination effects. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0738100X
Issue :
56
Database :
Complementary Index
Journal :
DAC: Annual ACM/IEEE Design Automation Conference
Publication Type :
Conference
Accession number :
155539406
Full Text :
https://doi.org/10.1145/3316781.3317749