Start Over

CTDUNet: A Multimodal CNN–Transformer Dual U-Shaped Network with Coordinate Space Attention for Camellia oleifera Pests and Diseases Segmentation in Complex Environments

Authors :: Ruitian Guo
Ruopeng Zhang
Hao Zhou
Tunjun Xie
Yuting Peng
Xili Chen
Guo Yu
Fangying Wan
Lin Li
Yongzhong Zhang
Ruifeng Liu
Source :: Plants, Vol 13, Iss 16, p 2274 (2024)
Publication Year :: 2024
Publisher :: MDPI AG, 2024.
Abstract: Camellia oleifera is a crop of high economic value, yet it is particularly susceptible to various diseases and pests that significantly reduce its yield and quality. Consequently, the precise segmentation and classification of diseased Camellia leaves are vital for managing pests and diseases effectively. Deep learning exhibits significant advantages in the segmentation of plant diseases and pests, particularly in complex image processing and automated feature extraction. However, when employing single-modal models to segment Camellia oleifera diseases, three critical challenges arise: (A) lesions may closely resemble the colors of the complex background; (B) small sections of diseased leaves overlap; (C) the presence of multiple diseases on a single leaf. These factors considerably hinder segmentation accuracy. A novel multimodal model, CNN–Transformer Dual U-shaped Network (CTDUNet), based on a CNN–Transformer architecture, has been proposed to integrate image and text information. This model first utilizes text data to address the shortcomings of single-modal image features, enhancing its ability to distinguish lesions from environmental characteristics, even under conditions where they closely resemble one another. Additionally, we introduce Coordinate Space Attention (CSA), which focuses on the positional relationships between targets, thereby improving the segmentation of overlapping leaf edges. Furthermore, cross-attention (CA) is employed to align image and text features effectively, preserving local information and enhancing the perception and differentiation of various diseases. The CTDUNet model was evaluated on a self-made multimodal dataset compared against several models, including DeeplabV3+, UNet, PSPNet, Segformer, HrNet, and Language meets Vision Transformer (LViT). The experimental results demonstrate that CTDUNet achieved an mean Intersection over Union (mIoU) of 86.14%, surpassing both multimodal models and the best single-modal model by 3.91% and 5.84%, respectively. Additionally, CTDUNet exhibits high balance in the multi-class segmentation of Camellia oleifera diseases and pests. These results indicate the successful application of fused image and text multimodal information in the segmentation of Camellia disease, achieving outstanding performance.

Subjects :: Camellia oleifera
multimodal
CTDUNet
Coordinate Space Attention
CNN–Transformer
segmentation
Botany
QK1-989

Details

Language :: English
ISSN :: 22237747
Volume :: 13
Issue :: 16
Database :: Directory of Open Access Journals
Journal :: Plants
Publication Type :: Academic Journal
Accession number :: edsdoj.36992e9d64f1a8e23e16933ac97d5
Document Type :: article
Full Text :: https://doi.org/10.3390/plants13162274

Full Text Access

View/download PDF

Tools

Email
Cite

Printer

Authors Abstract Subjects Details

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

CTDUNet: A Multimodal CNN–Transformer Dual U-Shaped Network with Coordinate Space Attention for Camellia oleifera Pests and Diseases Segmentation in Complex Environments

Abstract

Subjects

Details

Tools

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

CTDUNet: A Multimodal CNN–Transformer Dual U-Shaped Network with Coordinate Space Attention for Camellia oleifera Pests and Diseases Segmentation in Complex Environments

Abstract

Subjects

Details

Tools

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources