Swallow: A Versatile Accelerator for Sparse Neural Networks
- Authors
- Liu, Bosheng; Chen, Xiaoming; Han, Yinhe; and Xu, Haobo
- Subjects
- Artificial neural networks; Matrix multiplications; Sparse matrices; Network performance; Energy consumption
- Abstract
Sparse neural networks (SNNs) are emerging as a promising technique for resource-limited intelligent embedded systems because of their compact model size and uncompromised accuracy. Recently, most dedicated neural network accelerators have begun to exploit the sparsity of neural network models for performance boost and energy saving. However, existing sparsity-aware accelerators either fail to support both sparse weights and sparse activations, or fail to support them simultaneously for both convolutional (Conv) layers and fully connected (FC) layers, which dominate the computational time of neural networks. In this article, we propose a novel sparsity-aware accelerator architecture, called Swallow, to substantially improve inference performance by eliminating ineffectual weights and activations of neural networks. Swallow comprises: 1) a 2-D systolic architecture that fully utilizes the sparsity of both weights and activations in both Conv and FC layers and 2) a sparsity-aware dataflow that is optimized to reuse both weights and activations and to achieve high processing element (PE) utilization through sparse matrix multiplication tiling. Comprehensive evaluations based on a place-and-route flow show that Swallow, with 614-GOP/s peak performance and 1.26-W power, outperforms the state-of-the-art sparsity-aware accelerator Cambricon-X by 1.32× in terms of energy efficiency.
- Published
- 2020
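The abstract above hinges on skipping ineffectual (zero-valued) multiply-accumulates during sparse matrix multiplication. The Python sketch below is a purely conceptual illustration of that idea; the tile size, the dense array representation, and the MAC counter are assumptions made for illustration, and the code does not reproduce Swallow's actual PE array, dataflow, or tiling scheme.

```python
import numpy as np

def sparse_matmul_skip_zeros(weights, activations, tile=4):
    """Conceptual sketch: multiply a sparse weight matrix by a sparse
    activation matrix, performing only 'effectual' MACs (both operands
    nonzero). The tile loops loosely stand in for partitioning work
    across a PE array; this is NOT Swallow's hardware dataflow."""
    M, K = weights.shape
    K2, N = activations.shape
    assert K == K2, "inner dimensions must match"
    out = np.zeros((M, N))
    effectual_macs = 0
    # Iterate over output tiles (a software stand-in for PE-array tiling).
    for m0 in range(0, M, tile):
        for n0 in range(0, N, tile):
            for m in range(m0, min(m0 + tile, M)):
                for n in range(n0, min(n0 + tile, N)):
                    acc = 0.0
                    for k in range(K):
                        w, a = weights[m, k], activations[k, n]
                        if w != 0 and a != 0:  # skip ineffectual MACs
                            acc += w * a
                            effectual_macs += 1
                    out[m, n] = acc
    return out, effectual_macs

# Example: roughly 90%-sparse weights and 50%-sparse activations.
rng = np.random.default_rng(0)
W = rng.random((64, 64)) * (rng.random((64, 64)) > 0.9)
A = rng.random((64, 64)) * (rng.random((64, 64)) > 0.5)
out, macs = sparse_matmul_skip_zeros(W, A)
dense_macs = 64 * 64 * 64
print(f"effectual MACs: {macs} of {dense_macs} dense MACs")
```

The only point of the sketch is that the work (and hence the energy) scales with the number of effectual MACs rather than with the dense MAC count, which is the property a sparsity-aware accelerator such as Swallow exploits in hardware.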