
A High Performance Multi-Bit-Width Booth Vector Systolic Accelerator for NAS Optimized Deep Learning Neural Networks.

Authors :
Huang, Mingqiang
Liu, Yucen
Man, Changhai
Li, Kai
Cheng, Quan
Mao, Wei
Yu, Hao
Source :
IEEE Transactions on Circuits & Systems. Part I: Regular Papers. Sep2022, Vol. 69 Issue 9, p3619-3631. 13p.
Publication Year :
2022

Abstract

Multi-bit-width convolutional neural networks (CNNs) balance network accuracy against hardware efficiency, making them a promising approach to accurate yet energy-efficient edge computing. In this work, we develop a state-of-the-art multi-bit-width accelerator for NAS-optimized deep learning neural networks. To process multi-bit-width network inference efficiently, optimizations are proposed at multiple levels. First, a differential Neural Architecture Search (NAS) method is adopted to generate a high-accuracy multi-bit-width network. Second, a hybrid Booth-based multi-bit-width multiply-add-accumulation (MAC) unit is developed for data processing. Third, a vector systolic array is proposed to accelerate matrix multiplication effectively. With the vector-style systolic dataflow, both processing time and logic resource consumption are reduced compared with the classical systolic array. Finally, the proposed multi-bit-width CNN acceleration scheme has been deployed on a Xilinx ZCU102 FPGA platform. Average performance when accelerating the full NAS-optimized VGG16 network is 784.2 GOPS, and peak performance of the convolutional layer reaches 871.26 GOPS for INT8, 1676.96 GOPS for INT4, and 2863.29 GOPS for INT2, which is among the best results in previous CNN accelerator benchmarks. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
15498328
Volume :
69
Issue :
9
Database :
Academic Search Index
Journal :
IEEE Transactions on Circuits & Systems. Part I: Regular Papers
Publication Type :
Periodical
Accession number :
158869386
Full Text :
https://doi.org/10.1109/TCSI.2022.3178474