Methodologies of Compressing a Stable Performance Convolutional Neural Networks in Image Classification
- Source :
- Neural Processing Letters. 51:105-127
- Publication Year :
- 2019
- Publisher :
- Springer Science and Business Media LLC, 2019.
Abstract
- Deep learning has brought about a real revolution in embedded computing. The convolutional neural network (CNN) has proved a reliable fit for many emerging problems. The next step is to enhance the role of CNNs in embedded devices, in terms of both implementation details and performance. Storage and computational resources on embedded devices are limited and constrained, which raises key issues that must be considered. Compressing (i.e., quantizing) the CNN is a valuable solution. In this paper, our main goals are memory compression and complexity reduction (reducing both operations and cycles) of CNNs, using methods (including quantization and pruning) that do not require retraining (i.e., allowing us to exploit them in mobile systems or robots). We also explore additional quantization techniques for further complexity reduction. To achieve these goals, we compress the layers of a CNN model (i.e., parameters and outputs) into suitable precision formats using several quantization methodologies. The methodologies are as follows. First, we describe a pruning approach, which allows us to reduce the required storage and computation cycles in embedded devices. Such an enhancement can drastically reduce the power consumed and the resources required. Second, we present a hybrid quantization approach with automatic tuning for network compression. Third, we present a K-means quantization approach. With only minor degradation relative to floating-point performance, the presented pruning and quantization methods produce fixed-point reduced networks with stable performance. Precise fixed-point calculations for coefficients, input/output signals, and accumulators are considered in the quantization process.
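The three retraining-free techniques named in the abstract (pruning, fixed-point quantization, and K-means quantization) can be illustrated with a minimal sketch. The abstract does not give the paper's actual algorithms or settings, so everything below is an assumption made for illustration: the function names `prune`, `to_fixed_point`, and `kmeans_quantize` are hypothetical, pruning is done by a simple magnitude threshold, and the threshold, bit widths, and cluster count are arbitrary.

```python
# Illustrative sketch only. The threshold, bit widths, and cluster count are
# assumptions, not values or procedures taken from the paper.

def prune(weights, threshold):
    """Magnitude pruning: zero every weight whose absolute value is below threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def to_fixed_point(weights, total_bits=8, frac_bits=6):
    """Round to signed fixed-point with frac_bits fractional bits, saturating on overflow."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    codes = [max(lo, min(hi, round(w * scale))) for w in weights]
    # Return the real values that the integer codes represent.
    return [c / scale for c in codes]


def kmeans_quantize(weights, k=4, iters=20):
    """1-D K-means (Lloyd's algorithm); requires k >= 2.

    Returns the weights replaced by their nearest centroid, plus the centroids.
    """
    w_min, w_max = min(weights), max(weights)
    # Centroids start evenly spaced over the weight range (an assumed initialisation).
    centroids = [w_min + (w_max - w_min) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w in weights:
            nearest = min(range(k), key=lambda j: abs(w - centroids[j]))
            clusters[nearest].append(w)
        # Empty clusters keep their previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
    quantized = [min(centroids, key=lambda c: abs(w - c)) for w in weights]
    return quantized, centroids


if __name__ == "__main__":
    layer = [0.02, -0.5, 0.75, -0.01, 0.3, 1.2, -0.9, 0.45]
    pruned = prune(layer, threshold=0.05)
    print("pruned:     ", pruned)
    print("fixed-point:", to_fixed_point(pruned, total_bits=8, frac_bits=6))
    print("k-means:    ", kmeans_quantize(pruned, k=4)[0])
```

In this sketch, storage savings come from two places: pruned weights need not be stored, and K-means leaves only `k` distinct values, so each weight can be kept as a small index into a codebook.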
- Subjects :
- 0209 industrial biotechnology
Contextual image classification
Computer Networks and Communications
Computer science
business.industry
General Neuroscience
Deep learning
Computation
Quantization (signal processing)
Computational intelligence
02 engineering and technology
Convolutional neural network
Reduction (complexity)
020901 industrial engineering & automation
Computer engineering
Artificial Intelligence
0202 electrical engineering, electronic engineering, information engineering
Robot
020201 artificial intelligence & image processing
Artificial intelligence
business
Software
Details
- ISSN :
- 1573-773X and 1370-4621
- Volume :
- 51
- Database :
- OpenAIRE
- Journal :
- Neural Processing Letters
- Accession number :
- edsair.doi...........53235e57d29ebf9b196f10ee7f7acc8f