
CoupleUNet: Swin Transformer coupling CNNs makes strong contextual encoders for VHR image road extraction.

Authors:
Li, Ruirui
Chen, Tao
Liu, Yiran
Jiang, Haoyu
Source:
International Journal of Remote Sensing. Sep 2023, Vol. 44, Issue 18, p5788-5813. 26p.
Publication Year:
2023

Abstract

Accurately segmenting roads is challenging due to substantial intra-class variation, indistinct inter-class boundaries, and occlusions caused by shadows, trees, and buildings. Addressing these challenges requires attending to important texture details while simultaneously perceiving global geometric context. Recent research has shown that hybrid CNN-Transformer structures outperform either architecture used alone: CNNs excel at extracting local detail features, while Transformers naturally capture global context. In this paper, we propose a dual-branch network module called ConSwin, which combines the advantages of ResNet and the Swin Transformer for better road information extraction. Using ConSwin as the basic encoding module, we build an hourglass-shaped network with an encoder-decoder structure. To better transfer texture and structural detail between the encoder and decoder, we also design two novel connection blocks. Comparative experiments on the Massachusetts and CHN6-CUG datasets show that the proposed method outperforms state-of-the-art methods in overall accuracy, IoU, and F1. Additional experiments verify the effectiveness of our method, and visualization results demonstrate its ability to obtain better road representations. [ABSTRACT FROM AUTHOR]
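To make the dual-branch idea concrete, the sketch below shows a minimal encoder block in the spirit of the abstract: a convolutional (ResNet-style) branch for local texture runs in parallel with a self-attention branch standing in for the Swin Transformer branch, and the two outputs are fused. This is an illustrative assumption, not the authors' implementation; the class name, layer sizes, the plain (non-windowed) attention, and the concatenation-plus-1x1-convolution fusion are all placeholders chosen for brevity.

```python
# Minimal dual-branch (CNN + attention) block sketch in PyTorch.
# All design details here are assumptions for illustration only.
import torch
import torch.nn as nn


class DualBranchBlock(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Local branch: residual 3x3 convolutions (ResNet-style).
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Global branch: multi-head self-attention over flattened tokens
        # (a simplified stand-in for windowed Swin attention).
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # Fusion: concatenate both branches and project back with a 1x1 conv.
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = torch.relu(self.local(x) + x)            # local texture features
        tokens = self.norm(x.flatten(2).transpose(1, 2)) # (B, H*W, C)
        glob, _ = self.attn(tokens, tokens, tokens)      # global context
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([local, glob], dim=1))


if __name__ == "__main__":
    block = DualBranchBlock(channels=64)
    out = block(torch.randn(1, 64, 32, 32))
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```

In an hourglass-shaped encoder-decoder, blocks of this kind would sit at each encoder stage, with skip connections carrying the fused features to the decoder; the paper's two connection blocks refine that transfer, which this sketch does not attempt to reproduce.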

Details

Language:
English
ISSN:
0143-1161
Volume:
44
Issue:
18
Database:
Academic Search Index
Journal:
International Journal of Remote Sensing
Publication Type:
Academic Journal
Accession Number:
172868715
Full Text:
https://doi.org/10.1080/01431161.2023.2255353