A single-stream adaptive scene layout modeling method for scene recognition.

Authors :: Wang, Qun
Zhu, Feng
Lin, Zhiyuan
Wang, Jianyu
Li, Xiang
Zhao, Pengfei
Source :: Neural Computing & Applications. Aug2024, Vol. 36 Issue 22, p13703-13714. 12p.
Publication Year :: 2024
Abstract: Scene recognition is a foundational task in computer vision. Because scene images are typically composed of specific regions arranged in some layout, modeling the layouts of various scenes is a key clue for scene recognition. Existing methods usually require an additional stream to detect regions for subsequent modeling, which accumulates errors and may miss important information. They also use hand-crafted features to model relations between regions, which weakens the representation ability of layouts. In this paper, we propose a single-stream adaptive scene layout modeling approach based on a layout modeling module (LMM), which constructs layouts without additional detection streams and adaptively captures the relations by taking advantage of a graph attention network. LMM is appended directly to a convolutional neural network, where each pixel of the activation maps of the last convolutional layer is treated as a region and serves as an initial input node of the LMM. LMM first models the layout of each region, and then uses all regions with layout information to model the entire scene. Layout relations are encoded as edges, which are automatically analyzed according to region co-occurrence and relative position. Our work can be understood as optimizing the features of the activation maps from a scene layout modeling perspective for scene recognition. Experimental results on MIT67, SUN397, and Places365 show that our single-stream model achieves competitive performance. [ABSTRACT FROM AUTHOR]
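
The idea of treating each activation-map pixel as a graph node and weighting edges by co-occurrence and relative position can be illustrated with a minimal sketch. This is not the authors' LMM: the function name `layout_attention`, the dot-product similarity used as a stand-in for co-occurrence, the distance penalty `pos_scale`, and the mean pooling into a scene vector are all assumptions made for illustration. The attention scores here are fixed formulas, whereas a graph attention network would learn them.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def layout_attention(fmap, pos_scale=1.0):
    """Hypothetical layout step over an H x W x C activation map (nested lists).

    Each pixel is one node. Edge scores combine feature similarity
    (assumed proxy for region co-occurrence) with a penalty on spatial
    distance (assumed proxy for relative position).
    Returns (updated H x W x C grid, scene vector of length C).
    """
    h, w, c = len(fmap), len(fmap[0]), len(fmap[0][0])
    nodes = [(i, j, fmap[i][j]) for i in range(h) for j in range(w)]

    updated = []
    for i, j, f in nodes:
        scores = []
        for k, l, g in nodes:
            sim = sum(a * b for a, b in zip(f, g)) / math.sqrt(c)
            dist = math.hypot(i - k, j - l)
            scores.append(sim - pos_scale * dist)
        weights = softmax(scores)
        # Per-region layout: attention-weighted sum of all node features.
        agg = [sum(wt * g[d] for wt, (_, _, g) in zip(weights, nodes))
               for d in range(c)]
        updated.append(agg)

    # Whole-scene descriptor: mean over all layout-aware regions (assumption).
    scene = [sum(v[d] for v in updated) / len(updated) for d in range(c)]
    grid = [updated[r * w:(r + 1) * w] for r in range(h)]
    return grid, scene

if __name__ == "__main__":
    fmap = [[[1.0, 0.0], [0.0, 1.0]],
            [[1.0, 0.0], [0.5, 0.5]]]
    grid, scene = layout_attention(fmap)
    print(len(grid), len(grid[0]), len(scene))
```

The two stages mirror the order described in the abstract: each region is first updated with information from the other regions, and the updated regions are then pooled into one scene-level vector that a classifier could consume.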