
HDKD: Hybrid data-efficient knowledge distillation network for medical image classification.

Authors :
EL-Assiouti, Omar S.
Hamed, Ghada
Khattab, Dina
Ebied, Hala M.
Source :
Engineering Applications of Artificial Intelligence, Vol. 138, Part B, Dec 2024.
Publication Year :
2024

Abstract

Vision Transformers (ViTs) have achieved significant advances in computer vision tasks due to their powerful modeling capacity. However, their performance degrades notably when they are trained on insufficient data, owing to their lack of inherent inductive biases. Distilling knowledge and inductive biases from a Convolutional Neural Network (CNN) teacher has emerged as an effective strategy for improving the generalization of ViTs on limited datasets. Previous approaches to Knowledge Distillation (KD) have pursued two primary paths: some focused solely on distilling the logit distribution from the CNN teacher to the ViT student, neglecting the rich semantic information in intermediate features because of the structural differences between the two architectures. Others combined feature distillation with logit distillation, but this introduced alignment operations that limit the amount of knowledge transferred between the mismatched architectures and increase the computational overhead. To this end, this paper presents the Hybrid Data-efficient Knowledge Distillation (HDKD) paradigm, which employs a CNN teacher and a hybrid student. The choice of a hybrid student serves two main purposes. First, it leverages the strengths of both convolutions and transformers while sharing the convolutional structure with the teacher model. Second, this shared structure enables the direct application of feature distillation without any information loss or additional computational overhead. Additionally, we propose an efficient lightweight convolutional block named Mobile Channel-Spatial Attention (MBCSA), which serves as the primary convolutional block in both the teacher and student models. Extensive experiments on two public medical datasets demonstrate the superiority of HDKD over other state-of-the-art models, as well as its computational efficiency. Source code is available at: https://github.com/omarsherif200/HDKD.
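As a reading aid, the sketch below illustrates the two distillation signals the abstract describes: logit distillation from the CNN teacher to the hybrid student, plus direct feature distillation made possible by their shared convolutional stage, together with a CBAM-style channel-spatial attention block standing in for MBCSA. The abstract does not specify the actual HDKD loss, the MBCSA layout, or any hyperparameters, so every name, layer size, and weight here (ChannelSpatialAttention, hdkd_style_loss, tau, alpha, beta) is an illustrative assumption, not the authors' implementation; see the released source code for the real design.

    # Minimal sketch, assuming a PyTorch setup. Not the authors' code.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ChannelSpatialAttention(nn.Module):
        """CBAM-style channel + spatial attention; a hypothetical stand-in
        for the paper's MBCSA block, whose exact layout the abstract omits."""
        def __init__(self, channels: int, reduction: int = 4):
            super().__init__()
            self.channel_mlp = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
            )
            self.spatial_conv = nn.Conv2d(2, 1, kernel_size=7, padding=3)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Channel attention from globally average-pooled descriptors.
            b, c, _, _ = x.shape
            w = torch.sigmoid(self.channel_mlp(x.mean(dim=(2, 3)))).view(b, c, 1, 1)
            x = x * w
            # Spatial attention from channel-wise mean and max maps.
            s = torch.cat([x.mean(dim=1, keepdim=True),
                           x.amax(dim=1, keepdim=True)], dim=1)
            return x * torch.sigmoid(self.spatial_conv(s))

    def hdkd_style_loss(student_logits, teacher_logits,
                        student_feats, teacher_feats, labels,
                        tau: float = 4.0, alpha: float = 0.5, beta: float = 1.0):
        """Supervised loss + temperature-scaled logit KD + feature distillation.
        Because teacher and student share the convolutional stage, the feature
        maps are assumed to match in shape, so no projection/alignment layer
        (and hence no extra overhead or information loss) is needed."""
        ce = F.cross_entropy(student_logits, labels)
        kd = F.kl_div(F.log_softmax(student_logits / tau, dim=1),
                      F.softmax(teacher_logits / tau, dim=1),
                      reduction="batchmean") * tau * tau
        feat = F.mse_loss(student_feats, teacher_feats)
        return (1 - alpha) * ce + alpha * kd + beta * feat

The key point the sketch makes concrete is the second one in the abstract: when the two networks share a convolutional structure, feature distillation reduces to a direct elementwise loss on matching tensors, whereas mismatched CNN/ViT pairs would require learned alignment layers between the two feature spaces.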

Details

Language :
English
ISSN :
0952-1976
Volume :
138
Database :
Academic Search Index
Journal :
Engineering Applications of Artificial Intelligence
Publication Type :
Academic Journal
Accession number :
180994905
Full Text :
https://doi.org/10.1016/j.engappai.2024.109430