
A 0.078 pJ/SOP Unstructured Sparsity-Aware Spiking Attention/Convolution Processor with 3D Compute Array

Authors :
Fang, Chaoming
Shen, Ziyang
Zhao, Shiqi
Wang, Chuanqing
Tian, Fengshi
Yang, Jie
Sawan, Mohamad
Publication Year :
2024

Abstract

Large-scale SNNs that employ advanced network architectures such as Transformers have shown performance matching that of their ANN counterparts [1]. The inherent high sparsity of SNNs and their use of accumulation (AC) instead of multiply-accumulation offer the promise of energy-efficient computing. However, the energy efficiency of SNNs depends strongly on the data sparsity level and can deteriorate, falling even below that of resource-intensive ANNs when sparsity is low. Previous work [2] mitigates this problem with heterogeneous SNN and CNN cores but incurs increased area and power consumption due to its costly MAC array implementation. A spiking-only accelerator with enhanced energy efficiency across all sparsity levels is therefore highly desirable. To achieve this goal, three critical challenges must be addressed, as shown in Fig. 1: 1) Redundant memory access of weights and partial sums across different time steps leads to significant power consumption and wasted memory space. 2) Throughput degrades when exploiting unstructured spike sparsity: fetching irregularly distributed non-zero spikes and their corresponding weights one by one incurs long latency, compromising the benefits of high sparsity in large-scale SNNs. 3) One-size-fits-all scheduling yields imbalanced efficiency across operators: a homogeneous scheduler cannot serve all operators at maximal efficiency because their computational characteristics differ. © 2024 IEEE.
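The abstract's sparsity argument follows from simple energy accounting: an SNN replaces every multiply-accumulate with a conditional accumulate that fires only on a non-zero spike, but repeats the layer over several time steps, so its advantage over a dense ANN erodes as spike sparsity falls. The sketch below illustrates this in Python; all energy figures and the time-step count are hypothetical illustration values, not taken from the paper.

import numpy as np

# Hypothetical energy figures for illustration only (not from the paper):
E_AC_PJ = 1.0    # cost of one accumulate (synaptic operation, SOP)
E_MAC_PJ = 4.0   # cost of one multiply-accumulate in an ANN baseline
TIME_STEPS = 8   # an SNN repeats the layer over T time steps

def snn_layer(spike_train: np.ndarray, weights: np.ndarray):
    """Event-driven accumulation: only non-zero (binary) spikes trigger a
    weight fetch and an addition, so SOP count scales with spike activity."""
    potentials = np.zeros(weights.shape[1])
    sops = 0
    for spikes in spike_train:                 # one row per time step
        for i in np.flatnonzero(spikes):       # skip zero spikes entirely
            potentials += weights[i]           # AC only, no multiplication
            sops += weights.shape[1]
    return potentials, sops

rng = np.random.default_rng(0)
n_in, n_out = 256, 128
weights = rng.standard_normal((n_in, n_out))
ann_energy = n_in * n_out * E_MAC_PJ          # dense single-pass MAC baseline

for sparsity in (0.9, 0.5, 0.3):
    spike_train = (rng.random((TIME_STEPS, n_in)) > sparsity).astype(np.int8)
    _, sops = snn_layer(spike_train, weights)
    ratio = sops * E_AC_PJ / ann_energy
    print(f"sparsity={sparsity:.1f}: SNN/ANN energy ratio ~ {ratio:.2f}")

# With these toy numbers the SNN loses its edge once
# sparsity < 1 - E_MAC / (TIME_STEPS * E_AC), i.e. below ~0.5,
# which is the regime the paper's spiking-only accelerator targets.

Running this prints a ratio well below 1 at 90% sparsity but above 1 at 30%, matching the abstract's claim that SNN efficiency can fall below that of a dense ANN when sparsity is low; the per-spike fetch loop also makes challenge 2 (serial access to irregularly distributed non-zero spikes) concrete.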

Details

Database :
OAIster
Notes :
English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1452721393
Document Type :
Electronic Resource