Low Latency Implementations of CNN for Resource-Constrained IoT Devices.
- Source :
- IEEE Transactions on Circuits & Systems. Part II: Express Briefs; Dec2022, Vol. 69 Issue 12, p5124-5128, 5p
- Publication Year :
- 2022
Abstract
- Convolutional Neural Network (CNN) inference on a resource-constrained Internet-of-Things (IoT) device (i.e., ARM Cortex-M microcontroller) requires careful optimization to reduce the timing overhead. We propose two novel techniques to improve the computational efficiency of CNNs by targeting low-cost microcontrollers. Our techniques utilize on-chip memory and minimize redundant operations, yielding low-latency inference results on complex quantized models such as MobileNetV1. On the ImageNet dataset for per-layer quantization, we reduce inference latency and Multiply-and-Accumulate (MAC) per cycle by 22.4% and 22.9%, respectively, compared to the state-of-the-art mixed-precision CMix-NN library. On the CIFAR-10 dataset for per-channel quantization, we reduce inference latency and MAC per cycle by 31.7% and 31.3%, respectively. The achieved low-latency inference results can improve the user experience and save power budget in resource-constrained IoT devices. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 15497747
- Volume :
- 69
- Issue :
- 12
- Database :
- Complementary Index
- Journal :
- IEEE Transactions on Circuits & Systems. Part II: Express Briefs
- Publication Type :
- Academic Journal
- Accession number :
- 160688948
- Full Text :
- https://doi.org/10.1109/TCSII.2022.3205029