
Post Training Weight Compression with Distribution-based Filter-wise Quantization Step

Authors:
Daisuke Miyashita
Jun Deguchi
Asuka Maki
Shinichi Sasaki
Source:
COOL CHIPS
Publication Year:
2019
Publisher:
IEEE, 2019.

Abstract

Quantizing models to lower bit precision is a promising way to build lower-power, smaller-area neural network hardware. However, quantization to 4 or fewer bits usually requires additional retraining on a labeled dataset, using backpropagation, to recover test accuracy. In this paper, we propose a quantization scheme with a distribution-based filter-wise quantization step that requires no labeled dataset. A ResNet-50 model quantized with the proposed techniques to 8-bit activation and 3.04-bit weight precision achieves a top-1 inference accuracy of 74.3% on ImageNet.
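As a rough illustration of the idea, the sketch below applies post-training symmetric uniform quantization to a convolution layer's weights with a separate step per filter, deriving each step from that filter's weight distribution. The choice of standard deviation as the statistic and the scaling factor `alpha` are assumptions for illustration, not the paper's exact rule; all names and parameters here are hypothetical.

```python
import numpy as np

def quantize_filterwise(weights, bits=3, alpha=2.5):
    """Post-training symmetric uniform quantization with a per-filter
    step derived from each filter's weight distribution.

    weights: array of shape (num_filters, ...), e.g. conv kernels.
    alpha:   hypothetical factor mapping a filter's std to its
             clipping range (not the paper's exact derivation).
    """
    q_max = 2 ** (bits - 1) - 1            # e.g. 3 for signed 3-bit
    flat = weights.reshape(weights.shape[0], -1)
    # Distribution-based step: proportional to each filter's std,
    # so wide filters get a coarser step and narrow ones a finer step.
    step = alpha * flat.std(axis=1, keepdims=True) / q_max
    step = np.maximum(step, 1e-12)         # guard against constant filters
    q = np.clip(np.round(flat / step), -q_max, q_max)
    return (q * step).reshape(weights.shape)

# Example: quantize one convolution layer's weights to 3 bits.
w = np.random.randn(64, 3, 7, 7).astype(np.float32)
w_q = quantize_filterwise(w, bits=3)
```

Because the step is fixed per filter from unlabeled statistics alone, no labeled data or backpropagation is needed, which is the property the abstract highlights.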

Details

Database:
OpenAIRE
Journal:
2019 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS)
Accession number:
edsair.doi...........35cfe27cfed55eaa1fa3d4dabebddb6b