1. Bridge the Spatial Semantic Gap: Multiattention Shape Adaptive Network for Infrared Target Segmentation
- Author
Yang, Benyi; Zhu, Xiaozhou; Liu, Yibing; Zhou, Jiexin; Li, Shuaixin; Zhu, Hai; and Yao, Wen
- Abstract
For infrared target segmentation, convolutional neural network-based (CNN-based) methods have substantially improved segmentation accuracy, typically by enhancing and aggregating multi-level features. However, the semantic gap between features from different semantic levels limits the benefit of feature aggregation. To address this problem and improve the segmentation accuracy of infrared targets with tiny sizes and irregular shapes, the multiattention shape adaptive network (MASANet) is proposed in this article. Specifically, a spatial feature fusion dense-nested Unet (SFFDUnet) is designed as the feature extraction and aggregation backbone. Based on SFFDUnet, the bi-directional shape adaptive attention with channel-wise shape enhance (BSAA-CSE) block is proposed as a multiattention mechanism to reduce the semantic gap between features. Moreover, multiscale multibranch deep supervision is introduced to stabilize the training process and enable model pruning. In addition, a new infrared target dataset, CITY-IRTD, is constructed to evaluate the performance of our method. Experiments demonstrate the effectiveness of our method in reducing the semantic gap between features to be fused. As a result, MASANet achieves better performance on both public and our self-developed datasets than other state-of-the-art infrared target segmentation methods. Code and the dataset will be released at
https://github.com/twentyfiveYang/MASANet_CITY-IRTD.
- Published
- 2024