iCAR: Bridging Image Classification and Image-text Alignment for Visual Recognition

Authors :: Wei, Yixuan
Cao, Yue
Zhang, Zheng
Yao, Zhuliang
Xie, Zhenda
Hu, Han
Guo, Baining
Publication Year :: 2022
Abstract: Image classification, which classifies images by pre-defined categories, has been the dominant approach to visual representation learning over the last decade. Visual learning through image-text alignment, however, has emerged to show promising performance, especially for zero-shot recognition. We believe that these two learning tasks are complementary, and suggest combining them for better visual learning. We propose a deep fusion method with three adaptations that effectively bridge two learning tasks, rather than shallow fusion through naive multi-task learning. First, we modify the previous common practice in image classification, a linear classifier, with a cosine classifier which shows comparable performance. Second, we convert the image classification problem from learning parametric category classifier weights to learning a text encoder as a meta network to generate category classifier weights. The learnt text encoder is shared between image classification and image-text alignment. Third, we enrich each class name with a description to avoid confusion between classes and make the classification method closer to the image-text alignment. We prove that this deep fusion approach performs better on a variety of visual recognition tasks and setups than the individual learning or shallow fusion approach, from zero-shot/few-shot image classification, such as the Kornblith 12-dataset benchmark, to downstream tasks of action recognition, semantic segmentation, and object detection in fine-tuning and open-vocabulary settings. The code will be available at https://github.com/weiyx16/iCAR.
Comment: 22 pages, 6 figures
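
Illustrative note: the first two adaptations named in the abstract (a cosine classifier, and classifier weights generated by a text encoder from class names enriched with descriptions) can be sketched roughly as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation: `toy_text_encoder` is a hypothetical stand-in for a learned text encoder, the `temperature` scaling and its value are assumptions not given in the abstract, and the class descriptions are made up for illustration.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    # Scale vectors to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def cosine_logits(image_features, class_weights, temperature=0.07):
    # Cosine classifier: logits are cosine similarities divided by a temperature.
    # The temperature value is an assumption for this sketch.
    img = l2_normalize(image_features)
    w = l2_normalize(class_weights)
    return img @ w.T / temperature

def toy_text_encoder(texts, dim=16):
    # Hypothetical stand-in for a learned text encoder: each token maps to a
    # fixed pseudo-random vector and a text is the sum of its token vectors.
    out = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for token in text.lower().split():
            seed = sum(ord(c) for c in token)  # deterministic per token
            out[i] += np.random.default_rng(seed).standard_normal(dim)
    return out

# Class names enriched with short descriptions (illustrative text only).
class_texts = [
    "tabby cat, a domestic cat with a striped coat",
    "golden retriever, a breed of dog",
]

# Classifier weights are generated from text instead of being learned directly.
class_weights = toy_text_encoder(class_texts)

# Pretend image-encoder output that lies close to the second class embedding.
image_features = class_weights[[1]] + 0.01

logits = cosine_logits(image_features, class_weights)
pred = int(logits.argmax(axis=1)[0])
```

Because the weights come from text, the same scoring function could in principle be applied to image-caption pairs, which is the sense in which the abstract describes the text encoder as shared between the two tasks.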

Subjects :: Computer Science - Computer Vision and Pattern Recognition

Details

Database :: arXiv
Publication Type :: Report
Accession number :: edsarx.2204.10760
Document Type :: Working Paper
