Quantization Method Integrated with Progressive Quantization and Distillation Learning.
- Source :
- Procedia Computer Science; 2023, Vol. 228, p281-290, 10p
- Publication Year :
- 2023
-
Abstract
- This paper proposes a quantization method that integrates progressive quantization with distillation learning, aiming to address the difficulty traditional quantization methods have in maintaining model accuracy while reducing model size. The method converts weights from floating-point numbers to lower-bit integers through progressive quantization, reducing the storage space the model requires. At the same time, distillation learning is used to couple the progressively quantized model with the original floating-point model, improving the quantized model's accuracy. Experimental results show that the proposed method maintains high model accuracy while reducing model size, and outperforms traditional quantization methods. The method has broad application prospects in model compression and can be used in scenarios such as edge devices and cloud servers. [ABSTRACT FROM AUTHOR] (A minimal code sketch of this approach follows the record below.)
- Subjects :
- DISTILLATION
- INTEGERS
- Language :
- English
- ISSN :
- 1877-0509
- Volume :
- 228
- Database :
- Supplemental Index
- Journal :
- Procedia Computer Science
- Publication Type :
- Academic Journal
- Accession number :
- 173854060
- Full Text :
- https://doi.org/10.1016/j.procs.2023.11.032
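The abstract describes two cooperating pieces: a progressive schedule that lowers weight precision in stages, and a distillation loss that keeps the quantized student model aligned with the original floating-point teacher. The sketch below is a minimal PyTorch illustration of that general recipe, not the authors' implementation: the fake-quantization routine, the straight-through estimator, the 8 → 6 → 4 bit schedule, the temperature `T`, and the loss weight `alpha` are all assumptions chosen for clarity.

```python
# Minimal sketch of progressive quantization combined with distillation.
# All names and hyperparameters here are illustrative assumptions, not the
# paper's implementation.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


def fake_quantize(w: torch.Tensor, bits: int) -> torch.Tensor:
    """Uniform symmetric fake-quantization of a weight tensor to `bits` bits.
    Forward pass uses the quantized values; the straight-through estimator
    lets gradients flow back to the full-precision weights unchanged."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    w_q = torch.round(w / scale).clamp(-qmax, qmax) * scale
    return w + (w_q - w).detach()  # straight-through estimator


class QuantLinear(nn.Linear):
    """Linear layer whose weights are fake-quantized at the current bit-width."""
    bits: int = 32  # full precision by default; lowered per schedule stage

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.bits < 32:
            return F.linear(x, fake_quantize(self.weight, self.bits), self.bias)
        return super().forward(x)


def distill_step(student, teacher, x, y, optimizer, T=4.0, alpha=0.7):
    """One training step: cross-entropy on the labels plus a KL distillation
    term against the frozen floating-point teacher."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    ce = F.cross_entropy(s_logits, y)
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                  F.softmax(t_logits / T, dim=1),
                  reduction="batchmean") * T * T
    loss = alpha * kd + (1 - alpha) * ce
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Progressive schedule: lower the precision in stages (8 -> 6 -> 4 bits)
# rather than jumping straight to the target, fine-tuning at each stage.
teacher = nn.Sequential(QuantLinear(784, 256), nn.ReLU(), QuantLinear(256, 10))
student = copy.deepcopy(teacher)
teacher.eval()  # the floating-point teacher stays frozen
optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)

for bits in (8, 6, 4):
    for m in student.modules():
        if isinstance(m, QuantLinear):
            m.bits = bits
    x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))  # dummy batch
    loss = distill_step(student, teacher, x, y, optimizer)
    print(f"{bits}-bit stage, loss = {loss:.4f}")
```

In practice the bit schedule, the fine-tuning budget per stage, and the loss weighting would be tuned per model. The point of the staged schedule is that each stage starts from weights already adapted to a slightly higher precision, which is what makes the quantization "progressive" and helps preserve accuracy relative to one-shot quantization.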