CAP: Communication-Aware Automated Parallelization for Deep Learning Inference on CMP Architectures.
- Source :
- IEEE Transactions on Computers. Jul 2022, Vol. 71, Issue 7, p1626-1639. 14p.
- Publication Year :
- 2022
Abstract
- Real-time inference of deep learning models on embedded, energy-efficient devices is becoming increasingly desirable with the rapid growth of artificial intelligence at the edge. Specifically, to achieve high energy efficiency and scalability, efficient parallelization of single-pass deep neural network (DNN) inference on chip multiprocessor (CMP) architectures is urgently required by many time-sensitive applications. However, as the number of processing cores scales up and per-core performance continues to grow rapidly, on-chip inter-core data movement is prone to become a performance bottleneck for computation. To remedy this problem and further improve the performance of network inference, in this work we introduce a communication-aware DNN parallelization technique called CAP, which exploits the elasticity and noise tolerance of deep learning algorithms on CMPs. Moreover, in the hope that the conducted studies can provide new design insights for real-time neural network inference on embedded chips, we have also evaluated the proposed approach on both multi-core Neural Network Accelerator (NNA) chips and general-purpose chip multiprocessors. Our experimental results show that the proposed CAP can achieve 1.12×-1.65× system speedups and 1.14×-2.70× energy efficiency for different neural networks while maintaining inference accuracy, compared to baseline approaches. [ABSTRACT FROM AUTHOR]
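To make the abstract's core idea concrete: a communication-aware parallelizer decides, per layer, whether splitting work across cores pays off once inter-core data movement is accounted for. The sketch below is purely illustrative and is NOT the paper's CAP algorithm (the record contains no algorithmic detail); all function names, parameters, and the simple roofline-style cost model are assumptions introduced here.

```python
# Illustrative sketch only -- a generic communication-aware layer
# scheduler, not the actual CAP technique from the paper.
# Cost model (assumed): splitting a layer across `cores` divides the
# compute time but adds an activation-exchange cost over the on-chip link.

def choose_strategy(flops, act_bytes, cores, core_flops_per_s, link_bytes_per_s):
    """Return 'split' if multi-core execution of one layer is estimated
    to be faster than single-core execution, else 'single'."""
    # Estimated time when the layer is partitioned across all cores:
    # compute shrinks by `cores`, but partial activations must be exchanged.
    t_split = flops / (cores * core_flops_per_s) + act_bytes / link_bytes_per_s
    # Estimated time when one core runs the whole layer (no communication).
    t_single = flops / core_flops_per_s
    return "split" if t_split < t_single else "single"
```

For a compute-heavy layer (e.g. 1 GFLOP, 1 MB of activations, 4 cores), the communication term is negligible and splitting wins; for a tiny layer with large activations, the exchange cost dominates and the scheduler keeps it on one core. This kind of per-layer trade-off is what "communication-aware" parallelization refers to in general.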
Details
- Language :
- English
- ISSN :
- 0018-9340
- Volume :
- 71
- Issue :
- 7
- Database :
- Academic Search Index
- Journal :
- IEEE Transactions on Computers
- Publication Type :
- Academic Journal
- Accession number :
- 157325217
- Full Text :
- https://doi.org/10.1109/TC.2021.3099688