LHC: A Low-Power Heterogeneous Computing Method on Neural Network Accelerator
- Source: ICPADS
- Publication Year: 2019
- Publisher: IEEE, 2019.
Abstract
- Accelerators can achieve high performance and low energy consumption when training or running inference on neural networks. If Non-Neural Network (Non-NN) algorithms with large amounts of computation could make full use of such accelerators, it would be possible to speed up their execution, reduce energy consumption, and achieve load balancing, especially on mobile devices equipped with accelerators. However, accelerators are dedicated to neural network calculations, so Non-NN algorithms have difficulty exploiting their advantages. Furthermore, many hardware-specific restrictions become obstacles, such as the constrained precision of operands and the limited computation scale. In this paper, we propose a method named Low-power Heterogeneous Computing (LHC) to bridge the gap between Non-NN algorithms and NN accelerators. First, we analyze the general principle of the accelerator and reveal its calculation model. To hide the details of the underlying neural network library, we extract operators from the limited set of neural network computations it supports: we encapsulate the low-level library, extract operators suitable for general algorithms, and implement more advanced operators that adapt to the constrained hardware conditions. These operators make it easier for programmers to implement Non-NN algorithms. On the algorithm side, we extract the computationally intensive parts of a Non-NN algorithm and deploy those computational tasks on the accelerator by calling the operators. To verify our method, we implement and adapt three Non-NN algorithms with these operators, namely Grid-based Motion Statistics, k-Nearest Neighbors, and k-Means, on a specific accelerator, the Cambricon-1A. The experimental results show that the energy consumption of the computation is reduced by up to 5.4x compared with a CPU baseline. Our method can be further applied to other similar accelerators.
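The operator-mapping idea in the abstract can be illustrated with a small sketch. The NumPy snippet below is our own illustration, not code from the paper, and `pairwise_sq_dists` is a hypothetical helper: it shows how the distance step shared by k-Means and k-Nearest Neighbors can be recast as one large matrix multiply, i.e., the kind of operator an NN accelerator exposes through its neural-network library.

```python
import numpy as np

def pairwise_sq_dists(points, centers):
    # ||x - c||^2 = ||x||^2 + ||c||^2 - 2 x.c, so the O(n*k*d) distance
    # work collapses into a single matrix multiply -- the operation an NN
    # accelerator's fully-connected/convolution units natively execute.
    x_sq = np.sum(points ** 2, axis=1, keepdims=True)      # (n, 1)
    c_sq = np.sum(centers ** 2, axis=1, keepdims=True).T   # (1, k)
    return x_sq + c_sq - 2.0 * (points @ centers.T)        # (n, k)

# k-Means assignment step (k-NN candidate ranking works the same way):
rng = np.random.default_rng(0)
points = rng.random((1000, 64), dtype=np.float32)
centers = rng.random((8, 64), dtype=np.float32)
labels = np.argmin(pairwise_sq_dists(points, centers), axis=1)  # (1000,)
```

Under the paper's scheme, the matrix multiply would presumably be dispatched to the accelerator (e.g., the Cambricon-1A) through the encapsulated operators, while the cheap elementwise reductions remain on the CPU.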
- Subjects:
  - 010302 applied physics
  - Speedup
  - Artificial neural network
  - Computer science
  - Computation
  - Symmetric multiprocessor system
  - 02 engineering and technology
  - Energy consumption
  - Load balancing (computing)
  - Grid
  - 01 natural sciences
  - 020202 computer hardware & architecture
  - Computer engineering
  - 0103 physical sciences
  - 0202 electrical engineering, electronic engineering, information engineering
  - Central processing unit
- Database: OpenAIRE
- Journal: 2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS)
- Accession number: edsair.doi...........b18a636de26081c3db1726fc97def70d
- Full Text: https://doi.org/10.1109/icpads47876.2019.00053