RCSFN: A remote sensing image scene classification and recognition network based on rectangle convolutional self attention fusion.
- Source :
- Signal, Image & Video Processing; Dec2024, Vol. 18 Issue 12, p8739-8756, 18p
- Publication Year :
- 2024
Abstract
- Remote sensing scene classification is a critical task in the processing and analysis of remote sensing images. Traditional methods typically use standard convolutional kernels to extract feature information. Although these methods have seen improvements, they still struggle to fully capture unique local details, thus affecting classification accuracy. Each category within remote sensing scenes has its unique local details, such as the rectangular features of buildings in schools or industrial areas, as well as bridges and roads in parks or squares. The most important features are often these rectangular structures and their spatial positions, which standard convolutional kernels find challenging to capture effectively. To address this issue, we propose a remote sensing scene classification method based on a Rectangle Convolution Self-Attention Fusion Network (RCSFN) architecture. In the RCSFN network, the Rectangle Convolution Maximum Fusion (RCMF) module operates in parallel with the first 4 × 4 convolutional layer of VanillaNet-5. The RCMF module uses two different rectangular convolutional kernels to extract features with different receptive fields, enhancing the extraction of shallow local features through addition and fusion. This process, combined with the concatenation of the original input features, results in richer local detail information. Additionally, we introduce an Area Selection (AS) module that focuses on selecting feature information within local regions. The Sequential Polarisation Self-Attention (SPS) mechanism, integrated with the Mini Region Convolution (MRC) module through feature multiplication, enhances important features and improves spatial positional relationships, thereby increasing the accuracy of recognising categories with rectangular or elongated features. Experiments were carried out on the AID and NWPU-RESISC45 data sets, and the overall classification accuracy was 96.56% and 92.46%, respectively. This shows that the RCSFN network model proposed in this paper is feasible and effective for classification problems with unique local detail features. [ABSTRACT FROM AUTHOR]
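The fusion step the abstract attributes to the RCMF module (two rectangular kernels, added together, then concatenated with the original input) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no kernel sizes or weights, so the 1 × k and k × 1 shapes, the all-ones placeholder weights, and the names `conv2d_same` and `rectangle_fusion` are assumptions made purely for illustration.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_same(image: Matrix, kernel: Matrix) -> Matrix:
    """Cross-correlate a 2D image with a kernel, zero-padded to keep the input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2  # padding that centres an odd-sized kernel
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, x = i + di - ph, j + dj - pw
                    if 0 <= y < h and 0 <= x < w:  # positions outside count as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def rectangle_fusion(image: Matrix, k: int = 3) -> List[Matrix]:
    """Hypothetical RCMF-style step: wide + tall kernels, add, then concatenate."""
    # Assumed kernel shapes; real weights would be learned, ones are placeholders.
    wide = [[1.0] * k]                 # 1 x k: responds to horizontal structures
    tall = [[1.0] for _ in range(k)]   # k x 1: responds to vertical structures
    wide_feat = conv2d_same(image, wide)
    tall_feat = conv2d_same(image, tall)
    # Fusion by element-wise addition, as described in the abstract.
    fused = [[a + b for a, b in zip(rw, rt)] for rw, rt in zip(wide_feat, tall_feat)]
    # Concatenation with the original input along a channel axis.
    return [image, fused]


# A horizontal bar: the wide kernel responds strongly along it.
bar = [[0, 0, 0],
       [1, 1, 1],
       [0, 0, 0]]
channels = rectangle_fusion(bar)
print(channels[1])
```

In this toy input the wide kernel sums the whole bar at its centre while the tall kernel sees only one bar pixel, which is the sense in which differently shaped kernels give different receptive fields over elongated structures.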
Details
- Language :
- English
- ISSN :
- 18631703
- Volume :
- 18
- Issue :
- 12
- Database :
- Complementary Index
- Journal :
- Signal, Image & Video Processing
- Publication Type :
- Academic Journal
- Accession number :
- 180654595
- Full Text :
- https://doi.org/10.1007/s11760-024-03511-8