Optimization of Scatter Network Architectures and Bank Allocations for Sparse CNN Accelerators

Authors :: Sunwoo Kim
Sungkyung Park
Chester Sungchung Park
Source :: IEEE Access, Vol 10, Pp 85864-85879 (2022)
Publication Year :: 2022
Publisher :: IEEE, 2022.
Abstract: Sparse convolutional neural network (SCNN) accelerators eliminate unnecessary computations and memory access by exploiting zero-valued activation pixels and filter weights. However, data movement between the multiplier array and accumulator buffer tends to be a performance bottleneck. Specifically, the scatter network, which is the core block of SCNN accelerators, delivers Cartesian products to the accumulator buffer, and certain products are not immediately delivered owing to bus contention. A previous SCNN-based architecture eliminates bus contention and improves the performance significantly by making use of different dataflows. However, it relies only on weight sparsity, and its performance is highly dependent on the workload. In this paper, we propose a novel scatter network architecture for SCNN accelerators. First, we propose network topologies (such as window and split queuing), which define the connection between the FIFOs and crossbar buses in the scatter network. Second, we investigate arbitration algorithms (such as fixed priority, round-robin, and longest-queue-first), which define the priorities of the products delivered to the accumulator buffer. However, the optimization of the scatter network architecture alone may not be able to provide sufficient performance gain since it does not help to reduce bus contention itself. In this paper, we propose a cubic-constrained bank allocation for the accumulator buffer, which reduces bus contention without any increase in the hardware area. Based on the results of cycle-accurate simulation, register-transfer-level (RTL) design, and logic synthesis, this study investigates the trade-off between the performance and complexity of SCNN accelerators. 
In detail, it is verified that, when the optimized SCNN accelerators are applied to AlexNet, the proposed scatter network architecture can remove most of the performance degradation due to bus contention, thereby improving the accelerator performance by 72%, with an area increase of 18%. It is also shown that the proposed bank allocation provides an additional performance gain of up to 31% when it is applied to SqueezeNet. The proposed scatter network architectures and bank allocation can eliminate bus contention in most Cartesian product-based accelerators, regardless of the workload, without changing accelerator components other than the scatter network.

Subjects :: Accelerator
convolutional neural networks (CNNs)
cycle-accurate simulator
data compression
dataflow
network on a chip (NoC)
Electrical engineering. Electronics. Nuclear engineering
TK1-9971

Details

Language :: English
ISSN :: 2169-3536
Volume :: 10
Database :: Directory of Open Access Journals
Journal :: IEEE Access
Publication Type :: Academic Journal
Accession number :: edsdoj.7a74c275913a4a02b4dc095fcaab1df8
Document Type :: article
Full Text :: https://doi.org/10.1109/ACCESS.2022.3199010
