UMINet: a unified multi-modality interaction network for RGB-D and RGB-T salient object detection.

Authors :
Gao, Lina
Fu, Ping
Xu, Mingzhu
Wang, Tiantian
Liu, Bing
Source :
Visual Computer. Mar2024, Vol. 40 Issue 3, p1565-1582. 18p.
Publication Year :
2024

Abstract

Multi-modality images with complementary cues can significantly improve the performance of salient object detection (SOD) methods in challenging scenes. However, existing methods are generally designed specifically for either RGB-D or RGB-T SOD, so a unified SOD framework that can process various multi-modality images is needed to bridge this gap. To address this issue, we propose a unified multi-modality interaction fusion framework for RGB-D and RGB-T SOD, named UMINet. We investigate in depth the differences between appearance maps and complementary images and design an asymmetric backbone to extract appearance features and complementary cues. For the complementary cues branch, a complementary information aware module (CIAM) is proposed to perceive and enhance the weights of complementary modality features. We also propose a multi-modality difference fusion (MDF) block to fuse cross-modality features. This MDF block simultaneously considers the differences and the consistency between the appearance features and the complementary features. Furthermore, to promote rich contextual dependencies and integrate cross-level multi-modality features, we design a mutual refinement decoder (MRD) to progressively predict salient results. The MRD consists of three reverse perception blocks (RPB) and five sub-decoders. Extensive experiments demonstrate that the proposed UMINet achieves substantial improvement over existing state-of-the-art (SOTA) models on six RGB-D SOD datasets and three RGB-T SOD datasets. [ABSTRACT FROM AUTHOR]
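
The abstract says the MDF block considers both the differences and the consistency between appearance features and complementary features, but it does not give the operators involved. The following is only a minimal sketch of that general idea on plain feature vectors. The function name `fuse_features`, the choice of absolute difference and element-wise product, and the additive combination are all assumptions made for illustration. They are not the MDF block as defined in the paper.

```python
# Illustrative only: element-wise "difference" and "consistency" cues between
# two feature vectors. The operators below are assumptions, not the MDF block
# defined in the paper.

def fuse_features(appearance, complementary):
    """Fuse two equal-length feature vectors (hypothetical helper)."""
    if len(appearance) != len(complementary):
        raise ValueError("feature vectors must have the same length")
    fused = []
    for a, c in zip(appearance, complementary):
        difference = abs(a - c)   # large where the two modalities disagree
        consistency = a * c       # large where both modalities respond
        fused.append(a + difference + consistency)
    return fused


# Toy values standing in for RGB features and depth (or thermal) features.
rgb_features = [0.5, 1.0, 0.0]
depth_features = [0.5, 0.0, 2.0]
fused = fuse_features(rgb_features, depth_features)
```

In this toy setup, a position where only one modality responds is carried by the difference term, while a position where both respond is reinforced by the consistency term. A real network would apply learned convolutions to multi-channel feature maps and would not use fixed arithmetic on lists.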

Details

Language :
English
ISSN :
01782789
Volume :
40
Issue :
3
Database :
Academic Search Index
Journal :
Visual Computer
Publication Type :
Academic Journal
Accession number :
175459324
Full Text :
https://doi.org/10.1007/s00371-023-02870-6