1. RGB-D indoor semantic segmentation based on feature regulator and dual-path guidance.
- Author
- 张帅, 雷景生, 靳伍银, 俞云祥, and 杨胜英
- Abstract
To address inaccurate semantic segmentation results and coarse saliency maps for indoor scene images, this paper proposes FG-Net (feature regulator and dual-path guidance), a network architecture based on optimized multi-modal feature extraction and dual-path guided decoding. At each stage, the feature regulator sequentially applies noise filtering, re-weighted representation, differential complementation, and interactive fusion to the multi-modal features, optimizing the multi-modal feature representation during feature extraction by strengthening the aggregation of RGB and depth features. In the decoding stage, the dual-path guidance component then introduces rich cross-modal cues after the interactive feature fusion to further exploit the multi-modal features. By integrating multi-scale and multi-level feature information during decoding, the dual-path cooperative guidance structure outputs a more detailed saliency map. Experiments on the public datasets NYUD-v2 and SUN RGB-D achieve 48.5% on the main evaluation metric, mIoU, outperforming other state-of-the-art algorithms. The results show that the algorithm achieves more refined semantic segmentation of indoor scene images and has good generalization and robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2024