S2-aware network for visual recognition.
- Source :
- Signal Processing: Image Communication. Nov 2021, Vol. 99.
- Publication Year :
- 2021
Abstract
- Capturing comprehensive information about images of various sizes and shapes within the same convolution layer is typically a challenging task in computer vision. There are two main kinds of methods for capturing those features. The first uses the inception structure and its variants. The second utilizes larger convolution kernels on specific layers or stacks more convolution blocks. However, these methods can lead to heavy computational cost or vanishing gradients. In this paper, to accommodate feature distributions with different sizes and shapes while reducing computational cost, we propose a width- and depth-aware module, named the WD-module, to match feature distributions. Moreover, the proposed WD-module consumes less computation and fewer parameters than traditional residual convolution layers. To verify the effectiveness of the proposed method, a size- and shape-aware backbone network named S2A-Net was built by stacking WD-modules. By visualizing heat maps and features, we show that the proposed S2A-Net can adapt to objects with different sizes and shapes in visual recognition tasks and learn more comprehensive characteristics. Experimental results show that the proposed method achieves higher accuracy in image recognition and outperforms other state-of-the-art networks with the same number of layers. • A width-aware and depth-aware module is built. • A size-aware and shape-aware network for visual recognition is built. • Compared with other state-of-the-art methods, the proposed method achieves better results while consuming fewer parameters and FLOPs. [ABSTRACT FROM AUTHOR]
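The abstract contrasts single-kernel convolution with multi-branch, width-aware designs. The record does not give the WD-module's actual structure, so the following is only a minimal NumPy sketch of the general idea it describes: running parallel convolutions with different kernel widths over the same input and keeping all branch outputs, so one layer responds to features of several sizes. The names `wd_block` and `conv2d_same`, and the choice of widths (1, 3, 5), are illustrative assumptions, not the paper's design.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_same(x, kernel):
    # Zero-padded "same" 2D convolution of a single-channel map x (H, W)
    # with an odd-sized square kernel (k, k).
    k = kernel.shape[0]
    p = k // 2
    xp = np.pad(x, p)
    windows = sliding_window_view(xp, (k, k))  # shape (H, W, k, k)
    return np.einsum('hwij,ij->hw', windows, kernel)

def wd_block(x, widths=(1, 3, 5), seed=0):
    # Hypothetical width-aware branch (assumption: the paper's exact
    # WD-module is not described in this record): parallel convolutions
    # with different kernel widths, stacked as output channels.
    rng = np.random.default_rng(seed)
    branches = [conv2d_same(x, rng.standard_normal((k, k))) for k in widths]
    return np.stack(branches, axis=0)  # (num_branches, H, W)

x = np.arange(36, dtype=float).reshape(6, 6)
y = wd_block(x)
print(y.shape)  # (3, 6, 6): one same-sized response map per kernel width
```

Because every branch uses "same" padding, the spatial size is preserved and the branches concatenate cleanly, which is the property that lets such multi-width blocks be stacked into a backbone as the abstract describes for S2A-Net.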
Details
- Language :
- English
- ISSN :
- 0923-5965
- Volume :
- 99
- Database :
- Academic Search Index
- Journal :
- Signal Processing: Image Communication
- Publication Type :
- Academic Journal
- Accession number :
- 153120822
- Full Text :
- https://doi.org/10.1016/j.image.2021.116458