Post Training Weight Compression with Distribution-based Filter-wise Quantization Step
- Source: COOL CHIPS
- Publication Year: 2019
- Publisher: IEEE, 2019
Abstract
- Quantization of models to lower bit precision is a promising method for developing lower-power, smaller-area neural network hardware. However, quantization to 4 bits or fewer usually requires additional retraining via backpropagation on a labeled dataset to recover test accuracy. In this paper, we propose a quantization scheme with a distribution-based filter-wise quantization step that requires no labeled dataset. A ResNet-50 model quantized with the proposed techniques to 8-bit activations and an average weight precision of 3.04 bits achieves a top-1 inference accuracy of 74.3% on ImageNet.
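
The abstract does not spell out how the filter-wise step is derived from each weight distribution. As a rough illustration only, below is a minimal NumPy sketch of per-filter symmetric uniform quantization where the step is taken proportional to each filter's standard deviation; the function name `quantize_filterwise`, the rule `step = alpha * std / max_level`, and the constant `alpha` are assumptions for the sketch, not details from the paper.

```python
import numpy as np

def quantize_filterwise(weights, bits=3, alpha=2.5):
    """Symmetric uniform quantization with a per-filter step size.

    Sketch of distribution-based filter-wise quantization: each output
    filter gets its own step derived from the spread of its weights, so
    no labeled data or retraining is involved. The step rule and alpha
    are illustrative assumptions, not the paper's actual method.
    """
    max_level = 2 ** (bits - 1) - 1            # e.g. levels -3..3 for 3 bits
    quantized = np.empty_like(weights)
    for i, w in enumerate(weights):
        step = alpha * w.std() / max_level     # step from this filter's distribution
        if step == 0.0:                        # constant filter: copy through
            quantized[i] = w
            continue
        q = np.clip(np.round(w / step), -max_level, max_level)
        quantized[i] = q * step                # dequantize back to float
    return quantized

# Example: 3-bit weights for a random 64-filter 3x3 conv layer.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(64, 3, 3, 3)).astype(np.float32)
w_q = quantize_filterwise(w, bits=3)
print("mean |w - w_q|:", float(np.mean(np.abs(w - w_q))))
```

A distribution-derived step like this needs only the weights themselves, which is consistent with the abstract's claim that no labeled dataset or backpropagation is required.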
Details
- Database: OpenAIRE
- Journal: 2019 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS)
- Accession number: edsair.doi...........35cfe27cfed55eaa1fa3d4dabebddb6b