Multimodal Framework for Long-Tailed Recognition.
- Source :
- Applied Sciences (2076-3417); Nov 2024, Vol. 14 Issue 22, p10572, 14p
- Publication Year :
- 2024
Abstract
- Long-tailed data distribution (i.e., a few classes account for most of the data, while most classes have very few samples) is a common problem in image classification. In this paper, we propose a novel multimodal framework for long-tailed data recognition. In the first stage, long-tailed data are used for visual-semantic contrastive learning to obtain good features, while in the second stage, class-balanced data are used for classifier training. The proposed framework leverages the advantages of multimodal models and mitigates the problem of class imbalance in long-tailed data recognition. Experimental results demonstrate that the proposed framework achieves competitive performance on the CIFAR-10-LT, CIFAR-100-LT, ImageNet-LT, and iNaturalist 2018 datasets for image classification. [ABSTRACT FROM AUTHOR]
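The second stage described in the abstract trains the classifier on class-balanced data. The abstract does not say how that balanced data is obtained. One common option is class-balanced resampling, sketched here in Python under that assumption. The function name `class_balanced_sample` and the toy label counts are illustrative only and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def class_balanced_sample(labels, num_samples, seed=0):
    """Draw sample indices so that every class is equally likely.

    Hypothetical helper (an assumption, not the paper's method): pick a
    class uniformly at random, then pick one of its samples uniformly,
    with replacement.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    classes = sorted(by_class)
    return [rng.choice(by_class[rng.choice(classes)]) for _ in range(num_samples)]


# Toy long-tailed label list: class 0 dominates, classes 1 and 2 are rare.
labels = [0] * 900 + [1] * 90 + [2] * 10
indices = class_balanced_sample(labels, num_samples=3000)

# After resampling, each class should appear roughly equally often.
drawn = Counter(labels[i] for i in indices)
print(drawn)
```

Because rare classes are drawn with replacement, their few samples repeat many times, which is why this kind of resampling is usually paired with features that were learned beforehand on the full long-tailed data.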
- Subjects :
- IMAGE recognition (Computer vision)
- DATA distribution
- CLASSIFICATION
Details
- Language :
- English
- ISSN :
- 20763417
- Volume :
- 14
- Issue :
- 22
- Database :
- Complementary Index
- Journal :
- Applied Sciences (2076-3417)
- Publication Type :
- Academic Journal
- Accession number :
- 181174066
- Full Text :
- https://doi.org/10.3390/app142210572