
RingMo-SAM: A Foundation Model for Segment Anything in Multimodal Remote-Sensing Images

Authors :
Yan, Zhiyuan
Li, Junxi
Li, Xuexue
Zhou, Ruixue
Zhang, Wenkai
Feng, Yingchao
Diao, Wenhui
Fu, Kun
Sun, Xian
Source :
IEEE Transactions on Geoscience and Remote Sensing; 2023, Vol. 61, Issue 1, p. 1-16, 16 p.
Publication Year :
2023

Abstract

The segment anything model (SAM) has created a new paradigm for deep-learning-based semantic segmentation and has shown remarkable generalization performance. However, we find that it may fail or perform poorly in multimodal remote-sensing scenarios, especially on synthetic aperture radar (SAR) images. Moreover, SAM does not provide category information for objects. In this article, we propose a foundation model for multimodal remote-sensing image segmentation called RingMo-SAM, which can not only segment anything in optical and SAR remote-sensing data, but also identify object categories. First, a large-scale dataset containing millions of segmentation instances is constructed by collecting multiple open-source datasets in this field to train the model. Then, by constructing an instance-type and terrain-type category-decoupling mask decoder (CDMDecoder), category-wise segmentation of various objects is achieved. In addition, a prompt encoder embedded with the characteristics of multimodal remote-sensing data is designed. It not only supports multi-box prompts to improve the segmentation accuracy of multiple objects in complicated remote-sensing scenes, but also supports SAR-characteristics prompts to improve segmentation performance on SAR images. Extensive experiments on several datasets, including iSAID, ISPRS Vaihingen, ISPRS Potsdam, and AIR-PolSAR-Seg, demonstrate the effectiveness of our method.
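To make the architectural ideas in the abstract concrete, the following is a minimal PyTorch sketch of the two components it describes: a prompt encoder that accepts multiple box prompts plus an optional SAR-characteristics vector, and a mask decoder with decoupled instance-type and terrain-type classification heads. All names (MultiPromptEncoder, CategoryDecouplingDecoder), shapes, and dimensions here are assumptions for illustration; the paper's actual implementation is not reproduced.

# Illustrative sketch only; names, shapes, and dimensions are assumed,
# not taken from the RingMo-SAM implementation.
import torch
import torch.nn as nn

class MultiPromptEncoder(nn.Module):
    """Encodes multiple box prompts plus an optional SAR-characteristics
    vector (e.g., polarimetric features) into prompt tokens."""
    def __init__(self, embed_dim=256, sar_feat_dim=8):
        super().__init__()
        self.box_proj = nn.Linear(4, embed_dim)   # (x1, y1, x2, y2) -> token
        self.sar_proj = nn.Linear(sar_feat_dim, embed_dim)

    def forward(self, boxes, sar_feats=None):
        # boxes: (B, N, 4) normalized coordinates; one token per box prompt
        tokens = self.box_proj(boxes)             # (B, N, D)
        if sar_feats is not None:                 # sar_feats: (B, sar_feat_dim)
            sar_token = self.sar_proj(sar_feats).unsqueeze(1)  # (B, 1, D)
            tokens = torch.cat([tokens, sar_token], dim=1)
        return tokens

class CategoryDecouplingDecoder(nn.Module):
    """Decoupled heads in the spirit of the CDMDecoder: one predicts
    instance-type classes (e.g., vehicles, ships), the other
    terrain-type classes (e.g., water, vegetation)."""
    def __init__(self, embed_dim=256, n_instance=10, n_terrain=6):
        super().__init__()
        self.fuse = nn.MultiheadAttention(embed_dim, num_heads=8,
                                          batch_first=True)
        self.instance_head = nn.Linear(embed_dim, n_instance)
        self.terrain_head = nn.Linear(embed_dim, n_terrain)

    def forward(self, image_tokens, prompt_tokens):
        # Cross-attend prompt tokens to image tokens, then classify each
        # prompt token under both category taxonomies.
        fused, _ = self.fuse(prompt_tokens, image_tokens, image_tokens)
        return self.instance_head(fused), self.terrain_head(fused)

# Usage with placeholder tensors standing in for an image-encoder output:
enc = MultiPromptEncoder()
dec = CategoryDecouplingDecoder()
img = torch.randn(2, 196, 256)                    # assumed patch tokens
tok = enc(torch.rand(2, 3, 4), torch.rand(2, 8))  # 3 boxes + 1 SAR token
inst_logits, terr_logits = dec(img, tok)          # (2, 4, 10), (2, 4, 6)

Decoupling the two heads keeps man-made instance categories and background terrain categories in separate classification spaces, which matches the abstract's motivation for category-wise segmentation across heterogeneous remote-sensing classes.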

Details

Language :
English
ISSN :
0196-2892 (print) and 1558-0644 (electronic)
Volume :
61
Issue :
1
Database :
Supplemental Index
Journal :
IEEE Transactions on Geoscience and Remote Sensing
Publication Type :
Periodical
Accession number :
ejs64725238
Full Text :
https://doi.org/10.1109/TGRS.2023.3332219