
Knowledge distillation algorithm based on spatial attention maps (基于空间注意力图的知识蒸馏算法).

Authors :
王礼乐
刘渊
Source :
Application Research of Computers / Jisuanji Yingyong Yanjiu. Jun 2024, Vol. 41, Issue 6, p1693-1698. 6p.
Publication Year :
2024

Abstract

Knowledge distillation algorithms are highly effective for compressing deep neural networks. Current feature-based knowledge distillation algorithms either focus on improving a single component while ignoring other beneficial ones, or fail to provide effective guidance on the parts the small model should attend to, so the distillation effect is insufficient. To make full use of the large model's beneficial information and process it to improve the small model's knowledge conversion rate, this paper proposed a new distillation algorithm. First, it used a conditional probability distribution to fit the spatial feature distribution of the large model's intermediate layer; it then extracted the spatial attention maps, which tend to become similar after fitting, together with other beneficial information. Finally, it used a small convolutional layer to narrow the gap between the models and transmit the transformed information to the small model, achieving distillation. Experimental results show that the algorithm applies to multiple teacher-student combinations and generalizes across multiple datasets; compared with current state-of-the-art distillation algorithms, accuracy improves by about 1.19% and training time is shortened by 0.16 h. The method has significant engineering value and broad application prospects for optimizing large networks and deploying deep learning on low-resource devices. [ABSTRACT FROM AUTHOR]
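To make the pipeline concrete, below is a minimal PyTorch sketch of attention-map distillation in the spirit of the abstract. The paper's conditional-probability fitting step is not specified in the record, so this sketch substitutes the common construction of Zagoruyko and Komodakis (aggregating squared activations over channels) for the attention map, and a 1x1 convolution stands in for the "small convolutional layer" that narrows the gap between models. All names below are hypothetical, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def spatial_attention_map(feat: torch.Tensor) -> torch.Tensor:
    """Collapse a (N, C, H, W) feature map to a normalized (N, H*W) attention map."""
    attn = feat.pow(2).mean(dim=1)        # aggregate over channels -> (N, H, W)
    attn = attn.flatten(1)                # flatten spatial dims -> (N, H*W)
    return F.normalize(attn, p=2, dim=1)  # L2-normalize per sample


class DistillAdapter(nn.Module):
    """Small conv layer mapping student features toward the teacher's channel width."""

    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        self.proj = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        return self.proj(feat)


def attention_distill_loss(student_feat, teacher_feat, adapter):
    """MSE between the student's and teacher's spatial attention maps."""
    s = adapter(student_feat)
    # Match spatial size if the two stages disagree (an assumption; the
    # record does not say how resolution mismatches are handled).
    if s.shape[-2:] != teacher_feat.shape[-2:]:
        s = F.interpolate(s, size=teacher_feat.shape[-2:],
                          mode="bilinear", align_corners=False)
    return F.mse_loss(spatial_attention_map(s),
                      spatial_attention_map(teacher_feat))
```

In a typical training loop this term would be added to the task loss, e.g. total = ce_loss + beta * attention_distill_loss(s_feat, t_feat, adapter), with the weight beta tuned on a validation set; the paper's actual loss formulation may differ.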

Details

Language :
Chinese
ISSN :
1001-3695
Volume :
41
Issue :
6
Database :
Academic Search Index
Journal :
Application Research of Computers / Jisuanji Yingyong Yanjiu
Publication Type :
Academic Journal
Accession number :
177823938
Full Text :
https://doi.org/10.19734/j.issn.1001-3695.2023.10.0496