Cross-level and multiscale CNN-Transformer network for automatic building extraction from remote sensing imagery.

Authors :
Yuan, Qinglie
Xia, Bin
Source :
International Journal of Remote Sensing. May 2024, Vol. 45 Issue 9, p2893-2914. 22p.
Publication Year :
2024

Abstract

Automatic building detection from remote sensing images has important applications in many domains, including cartography, land-use change detection, urban planning, and 3D intelligent city construction. Driven by advances in deep learning, convolutional neural networks (CNNs) have emerged as a powerful framework for automatic interpretation, offering hierarchical feature generation and robust representation for building extraction. However, the inherently local nature of convolutional operators hinders the acquisition of an effective global receptive field. More recently, Transformers have adopted the multi-head self-attention (MHSA) mechanism, which excels at constructing global context. Nonetheless, this network structure has difficulty modelling two-dimensional spatial structure features with scale variation in semantic segmentation, and it often has a very large number of parameters and high algorithmic complexity. To address these problems, this paper proposes a hybrid deep neural network that combines the strengths of CNNs and Transformers to extract buildings from remote sensing images. An improved MHSA mechanism is developed to construct multiscale global representations while reducing computational complexity. Furthermore, a cross-level MHSA models the semantic correlation between deep and shallow-level features to refine fine-grained spatial information. The proposed method was comprehensively evaluated on two building datasets via quantitative analysis and visualization. The experimental results confirmed that the developed network and modules outperform other state-of-the-art methods and can effectively improve accuracy (93.4% F1-score on the WHU dataset, 89.38% F1-score on the CAMB dataset) with high efficiency for building extraction. [ABSTRACT FROM AUTHOR]
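
The abstract reports accuracy as an F1-score. As a minimal sketch of how such a score is commonly computed for a binary building mask, and not the authors' evaluation code, the example below counts true positives, false positives, and false negatives per pixel. The function name `f1_score` and the toy masks are assumptions made for illustration; the abstract does not say how the metric was aggregated over tiles or images.

```python
def f1_score(pred, truth):
    """F1 for the building class, given two equally sized binary masks.

    Hypothetical helper for illustration: masks are lists of rows,
    where 1 marks a building pixel and 0 marks background.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # building predicted, building present
            elif p and not t:
                fp += 1          # building predicted, background present
            elif t and not p:
                fn += 1          # background predicted, building present
    if tp == 0:
        return 0.0               # avoids division by zero when nothing matches
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 2x3 masks (made-up values, not data from the paper).
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]

score = f1_score(pred, truth)
```

With these toy masks there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3 and the F1-score is also 2/3.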

Details

Language :
English
ISSN :
0143-1161
Volume :
45
Issue :
9
Database :
Academic Search Index
Journal :
International Journal of Remote Sensing
Publication Type :
Academic Journal
Accession number :
176985589
Full Text :
https://doi.org/10.1080/01431161.2024.2339199