1. Residual Quantization for Low Bit-Width Neural Networks
- Author
- Wenjun Zhang, Bingbing Ni, Zefan Li, Wen Gao, Teng Li, and Xiaokang Yang
- Subjects
- Artificial neural network, Computer science, Quantization (signal processing), Binary number, Maximization, Residual, Computer Science Applications, Acceleration, Compression (functional analysis), Signal Processing, Media Technology, Electrical and Electronic Engineering, Representation (mathematics), Algorithm
- Abstract
Neural network quantization has been shown to be an effective technique for network compression and acceleration. However, existing binary or ternary quantization methods suffer from two major issues. First, low bit-width input/activation quantization easily results in severe prediction accuracy degradation. Second, network training and quantization are usually treated as two unrelated tasks, leading to accumulated parameter training error and quantization error. In this work, we introduce a novel scheme, named Residual Quantization, to train a neural network with both weights and inputs constrained to low bit-width, e.g., binary or ternary values. On one hand, by recursively performing residual quantization, the resulting binary/ternary network is guaranteed to approximate the full-precision network with much smaller errors. On the other hand, we mathematically re-formulate the network training scheme in an EM-like manner, which iteratively performs network quantization and parameter optimization. During expectation, the low bit-width network is encouraged to approximate the full-precision network. During maximization, the low bit-width network is further tuned to gain better representation capability. Extensive experiments demonstrate that the proposed quantization scheme outperforms previous low bit-width methods and achieves performance much closer to that of the full-precision counterpart.
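To make the recursion in the abstract concrete, here is a minimal NumPy sketch of residual binary quantization of a weight tensor; it is an illustration of the general idea, not the authors' exact algorithm. The stage count and the per-stage scale `alpha_k = mean(|r_k|)` (the L2-optimal scale for a sign basis) are assumptions for this sketch.

```python
import numpy as np

def residual_binary_quantize(w, num_stages=2):
    """Approximate w as a sum of scaled binary tensors:
    w ~ sum_k alpha_k * b_k, with b_k in {-1, +1}.
    Each stage quantizes the residual left by the previous stages,
    so the approximation error shrinks as num_stages grows."""
    residual = np.asarray(w, dtype=np.float64).copy()
    alphas, bases = [], []
    for _ in range(num_stages):
        b = np.sign(residual)
        b[b == 0] = 1.0                    # break ties at exactly zero
        alpha = np.mean(np.abs(residual))  # L2-optimal scale for sign(residual)
        alphas.append(alpha)
        bases.append(b)
        residual = residual - alpha * b    # quantize this residual next
    approx = sum(a * b for a, b in zip(alphas, bases))
    return approx, alphas, bases
```

With one stage this reduces to classic scaled binarization; each extra stage binarizes the remaining error, which is why a two- or three-stage residual code tracks the full-precision weights far more closely than a single binary code.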
- Published
- 2023