Few-Shot Image Classification of Crop Diseases Based on Vision–Language Models

Authors :
Yueyue Zhou
Hongping Yan
Kun Ding
Tingting Cai
Yan Zhang
Source :
Sensors, Vol 24, Iss 18, p 6109 (2024)
Publication Year :
2024
Publisher :
MDPI AG, 2024.

Abstract

Accurate crop disease classification is crucial for ensuring food security and enhancing agricultural productivity. However, existing crop disease classification algorithms primarily rely on a single image modality and typically require a large number of samples. Our research addresses these issues by using pre-trained Vision–Language Models (VLMs), whose combination of image and text information yields better crop disease classification than traditional unimodal approaches. First, we apply the multimodal model Qwen-VL to generate detailed textual descriptions of representative disease images selected from the training set through clustering; these descriptions serve as the prompt text from which the classifier weights are generated. Compared with generating prompt text from a language model alone, this approach better captures and conveys fine-grained, image-specific information, thereby improving prompt quality. Second, for prompt text processing, we integrate cross-attention into the training-free mode VLCD (Vision-Language model for Crop Disease classification) and SE (Squeeze-and-Excitation) attention into the training-required mode VLCD-T (VLCD-Training), enhancing the classifier weights by emphasizing the key text features. The experimental results show that our method improves classification effectiveness in few-shot crop disease scenarios, addressing both limited data and the difficulty of recognizing intricate diseases. It offers a practical tool for agricultural pathology and strengthens the infrastructure for smart farming surveillance.
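
The idea of using prompt text to generate classifier weights can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a CLIP-style setup in which text and images are embedded into a shared space, it uses synthetic vectors in place of real encoder outputs, and it omits the cross-attention and SE attention steps. The function names `build_classifier_weights` and `classify` are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def build_classifier_weights(prompt_embeddings):
    """Average the prompt-text embeddings of each class into one weight vector.

    prompt_embeddings: array of shape (num_classes, num_prompts, dim).
    Returns an array of shape (num_classes, dim) with unit-length rows.
    """
    normalized = l2_normalize(prompt_embeddings)
    class_means = normalized.mean(axis=1)
    return l2_normalize(class_means)


def classify(image_embeddings, weights, temperature=100.0):
    """Score images against class weights by scaled cosine similarity.

    The temperature value is an assumption, not taken from the paper.
    Returns (predicted class indices, logits).
    """
    logits = temperature * l2_normalize(image_embeddings) @ weights.T
    return logits.argmax(axis=1), logits


# Synthetic stand-ins for encoder outputs: 3 classes, 4 prompts each, dim 8.
rng = np.random.default_rng(0)
prototypes = np.eye(3, 8)
prompts = prototypes[:, None, :] + 0.05 * rng.normal(size=(3, 4, 8))
images = prototypes[[2, 0, 1]] + 0.05 * rng.normal(size=(3, 8))

weights = build_classifier_weights(prompts)
predictions, _ = classify(images, weights)
print(predictions)
```

In this sketch, the text side alone defines the classifier, so no image labels are needed to build the weights. That is the property that makes a training-free mode possible.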

Details

Language :
English
ISSN :
14248220
Volume :
24
Issue :
18
Database :
Directory of Open Access Journals
Journal :
Sensors
Publication Type :
Academic Journal
Accession number :
edsdoj.8da3df5cfe14afcb6e916dee6e33a89
Document Type :
article
Full Text :
https://doi.org/10.3390/s24186109