
Cross-Architecture Knowledge Distillation.

Authors :
Liu, Yufan
Cao, Jiajiong
Li, Bing
Hu, Weiming
Ding, Jingting
Li, Liang
Maybank, Stephen
Source :
International Journal of Computer Vision. Aug 2024, Vol. 132, Issue 8, pp. 2798-2824. 27 pages.
Publication Year :
2024

Abstract

The Transformer network architecture has gained attention due to its ability to learn global relations and its superior performance. To boost performance, it is natural to distill complementary knowledge from a Transformer network to a convolutional neural network (CNN). However, most existing knowledge distillation methods only consider homologous-architecture distillation, which may not be suitable for cross-architecture scenarios, such as from Transformer to CNN. To address this problem, we analyze the globality and transferability of models, which reflect the ability to capture global knowledge and the ability to transfer knowledge from teacher to student, respectively. Inspired by our observations, we propose a novel cross-architecture knowledge distillation method that supports bi-directional distillation, both from Transformer to CNN and from CNN to Transformer. Specifically, rather than directly mimicking the output and intermediate features of the teacher, a partial cross-attention projector (PCA/iPCA) and a group-wise linear projector (GL/iGL) are introduced to align the student features with the teacher's in two projected feature spaces. To better match the teacher's knowledge with the student's knowledge, an adaptive distillation router (ADR) is presented to decide which teacher layer's knowledge should guide which student layer. A multi-view robust training scheme is further presented to improve the robustness of the distillation framework. Extensive experiments show that the proposed method outperforms 17 state-of-the-art methods on both small-scale and large-scale datasets. [ABSTRACT FROM AUTHOR]
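
The abstract describes aligning student features with teacher features in a projected space instead of matching them directly. The general idea can be illustrated with a minimal sketch, which is not the authors' PCA or GL projector: it assumes a single plain linear map applied at every spatial position, followed by a mean squared error against the teacher features. The function names (project, mse, alignment_loss), the shapes, and the toy values are all hypothetical and chosen only for illustration.

```python
from typing import List

Matrix = List[List[float]]  # rows = spatial positions (or tokens), columns = channels


def project(features: Matrix, weight: Matrix) -> Matrix:
    """Map each position from C_s student channels to C_t teacher channels.

    features: N x C_s, weight: C_s x C_t, result: N x C_t.
    This is an ordinary linear projection, assumed here as a stand-in
    for the paper's projectors.
    """
    out_dim = len(weight[0])
    return [
        [sum(value * weight[i][j] for i, value in enumerate(row)) for j in range(out_dim)]
        for row in features
    ]


def mse(a: Matrix, b: Matrix) -> float:
    """Mean squared error over all entries of two equally shaped matrices."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def alignment_loss(student: Matrix, teacher: Matrix, weight: Matrix) -> float:
    """Distance between projected student features and teacher features."""
    return mse(project(student, weight), teacher)


# Toy example: 2 positions, 2 student channels, 3 teacher channels.
student = [[1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
teacher = [[1.0, 2.0, 3.0], [4.0, 5.0, 8.0]]
print(alignment_loss(student, teacher, weight))
```

In a real training loop the projection weights would be learned jointly with the student, and a loss of this kind would be added to the task loss; how the paper weights and routes such terms across layers is not specified in the abstract.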

Details

Language :
English
ISSN :
0920-5691
Volume :
132
Issue :
8
Database :
Academic Search Index
Journal :
International Journal of Computer Vision
Publication Type :
Academic Journal
Accession number :
178402110
Full Text :
https://doi.org/10.1007/s11263-024-02002-0