Post Training Weight Compression with Distribution-based Filter-wise Quantization Step
- Source: COOL CHIPS
- Publication Year: 2019
- Publisher: IEEE, 2019
Abstract
- Quantization of models to lower bit precision is a promising method for developing lower-power, smaller-area neural network hardware. However, quantization to 4 bits or fewer usually requires additional retraining via backpropagation on a labeled dataset to recover test accuracy. In this paper, we propose a quantization scheme with a distribution-based filter-wise quantization step that requires no labeled dataset. A ResNet-50 model quantized with the proposed techniques to 8-bit activations and an average weight precision of 3.04 bits achieves a top-1 inference accuracy of 74.3% on ImageNet.
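
The abstract does not spell out how the filter-wise step is derived from each weight distribution. As a rough illustration only, below is a minimal NumPy sketch of per-filter symmetric uniform quantization where the step is taken proportional to each filter's standard deviation; the function name `quantize_filterwise`, the rule `step = alpha * std / max_level`, and the constant `alpha` are assumptions for the sketch, not details from the paper.

```python
import numpy as np

def quantize_filterwise(weights, bits=3, alpha=2.5):
    """Symmetric uniform quantization with a per-filter step size.

    Sketch of distribution-based filter-wise quantization: each output
    filter gets its own step derived from the spread of its weights, so
    no labeled data or retraining is involved. The step rule and alpha
    are illustrative assumptions, not the paper's actual method.
    """
    max_level = 2 ** (bits - 1) - 1            # e.g. levels -3..3 for 3 bits
    quantized = np.empty_like(weights)
    for i, w in enumerate(weights):
        step = alpha * w.std() / max_level     # step from this filter's distribution
        if step == 0.0:                        # constant filter: copy through
            quantized[i] = w
            continue
        q = np.clip(np.round(w / step), -max_level, max_level)
        quantized[i] = q * step                # dequantize back to float
    return quantized

# Example: 3-bit weights for a random 64-filter 3x3 conv layer.
rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(64, 3, 3, 3)).astype(np.float32)
w_q = quantize_filterwise(w, bits=3)
print("mean |w - w_q|:", float(np.mean(np.abs(w - w_q))))
```

A distribution-derived step like this needs only the weights themselves, which is consistent with the abstract's claim that no labeled dataset or backpropagation is required.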
Details
- Database: OpenAIRE
- Journal: 2019 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS)
- Accession number: edsair.doi...........35cfe27cfed55eaa1fa3d4dabebddb6b