1. A multi-level collaborative self-distillation learning for improving adaptive inference efficiency
- Authors
Likun Zhang, Jinbao Li, Benqian Zhang, and Yahong Guo
- Subjects
Multi-level collaborative self-distillation, Multi-exit network, Collaborative learning, Adaptive inference, Efficient computing, Electronic computers. Computer science, QA75.5-76.95, Information technology, T58.5-58.64
- Abstract
A multi-exit network is an important technique for achieving adaptive inference by dynamically allocating computational resources based on different input samples. Existing works mainly treat the final classifier as the teacher, enhancing the classification accuracy by transferring knowledge to the intermediate classifiers. However, this traditional self-distillation training strategy only utilizes the knowledge contained in the final classifier, neglecting potentially distinctive knowledge in the other classifiers. To address this limitation, we propose a novel multi-level collaborative self-distillation learning strategy (MLCSD) that extracts knowledge from all the classifiers. MLCSD dynamically determines the weight coefficients for each classifier’s contribution through a learning process, thus constructing more comprehensive and effective teachers tailored to each classifier. These new teachers transfer the knowledge back to each classifier through a distillation technique, thereby further improving the network’s inference efficiency. We conduct experiments on three datasets: CIFAR10, CIFAR100, and Tiny-ImageNet. Compared with the baseline network that employs traditional self-distillation, our MLCSD-Net based on ResNet18 enhances the average classification accuracy by 1.18%. The experimental results demonstrate that MLCSD-Net improves the inference efficiency of adaptive inference applications, such as anytime prediction and budgeted batch classification. Code is available at https://github.com/deepzlk/MLCSD-Net.
- Published
2024
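
As a reading aid, here is a minimal sketch (not the authors' released code at the repository above) of how the loss described in the abstract could be realized in PyTorch: each exit's teacher is a learned weighted mixture of all exits' softened predictions, and that teacher is distilled back into the exit. The class name `MLCSDLoss`, the mixing matrix `alpha`, the temperature `T`, and the label supervision used to learn the mixing weights are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MLCSDLoss(nn.Module):
    """Sketch of a collaborative self-distillation loss over all exits of a multi-exit network."""

    def __init__(self, num_exits, temperature=3.0):
        super().__init__()
        self.T = temperature
        # Learnable mixing weights: row i decides how much each exit's
        # prediction contributes to the collaborative teacher of exit i.
        # Zero init -> uniform rows after softmax, i.e. a plain average ensemble.
        self.alpha = nn.Parameter(torch.zeros(num_exits, num_exits))

    def forward(self, logits_list, targets):
        # logits_list: one [batch, num_classes] logits tensor per exit classifier.
        ce = sum(F.cross_entropy(logits, targets) for logits in logits_list)

        # Softened, gradient-detached predictions of every exit.
        soft = torch.stack(
            [F.softmax(logits.detach() / self.T, dim=-1) for logits in logits_list]
        )                                                       # [E, B, C]
        weights = F.softmax(self.alpha, dim=-1)                 # [E, E], rows sum to 1
        teachers = torch.einsum("ij,jbc->ibc", weights, soft)   # one teacher per exit

        kd, teacher_ce = 0.0, 0.0
        for i, logits in enumerate(logits_list):
            log_student = F.log_softmax(logits / self.T, dim=-1)
            # Distill the collaborative teacher back into exit i.
            kd = kd + F.kl_div(log_student, teachers[i].detach(), reduction="batchmean")
            # Label supervision on the mixed teacher drives the learning of alpha
            # (an assumption; the paper may learn the weight coefficients differently).
            teacher_ce = teacher_ce + F.nll_loss(torch.log(teachers[i] + 1e-8), targets)

        return ce + (self.T ** 2) * kd + teacher_ce
```

A hypothetical usage would pass the list of exit logits produced by the backbone, e.g. `loss = MLCSDLoss(num_exits=4)(model(images), labels)`, where `model(images)` returns one logits tensor per exit.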