1. Wider or Deeper: Revisiting the ResNet Model for Visual Recognition.
- Author
- Wu, Zifeng; Shen, Chunhua; van den Hengel, Anton
- Subjects
- *ARTIFICIAL neural networks, *IMAGE segmentation, *DEEP learning, *DIGITAL image processing, *MACHINE learning
- Abstract
Highlights
• We further develop the unravelled view of ResNets, which helps us better understand their behaviour. We demonstrate this in the context of a training process, which is the key difference from the original formulation.
• We propose a group of relatively shallow convolutional networks based on this new understanding. Some of them perform comparably with state-of-the-art approaches on the ImageNet classification dataset.
• We evaluate the impact of using different networks on the performance of semantic image segmentation, and show that these networks, used as pre-trained features, can substantially boost existing algorithms.
Abstract
The community has been going deeper and deeper in designing one cutting-edge network after another, yet some works suggest that we may have gone too far in this dimension. Some researchers unravelled a residual network into an exponentially wider one, and attributed the success of residual networks to the fusion of a large number of relatively shallow models. Since some of their early claims remain unsettled, in this paper we dig further into this topic, i.e., the unravelled view of residual networks. Based on that, we try to find a good compromise between depth and width. We then walk through a typical pipeline for developing a deep-learning-based algorithm. We start from a group of relatively shallow networks, which perform as well as or even better than the current (much deeper) state-of-the-art models on the ImageNet classification dataset. Then, we initialize fully convolutional networks (FCNs) using our pre-trained models and fine-tune them for semantic image segmentation. Results show that the proposed networks, used as pre-trained features, substantially boost existing methods. Even without exhausting the sophisticated techniques available to improve the classic FCN model, we achieve results comparable with the best performers on four widely used datasets, i.e., Cityscapes, PASCAL VOC, ADE20k and PASCAL-Context.
The code and pre-trained models are released for public access: https://github.com/itijyou/ademxapp. [ABSTRACT FROM AUTHOR]
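The "unravelled view" mentioned in the abstract can be sketched numerically. The following is a minimal illustration (not taken from the paper itself) using hypothetical linear residual branches in place of the paper's convolutional blocks: two stacked residual blocks y = x + f(x) compute the same output as the sum over all 2^2 = 4 paths that either pass through or skip each block's branch.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
x = rng.normal(size=d)

# Hypothetical linear residual branches f_i(x) = W_i @ x,
# standing in for the convolutional blocks discussed in the paper.
W1 = rng.normal(size=(d, d))
W2 = rng.normal(size=(d, d))

def block(x, W):
    # One residual block: identity skip connection plus a branch.
    return x + W @ x

# Sequential (stacked) view: apply the two blocks in order.
stacked = block(block(x, W1), W2)

# Unravelled view: the same output, written as a sum over all
# 2^2 = 4 paths (skip both, branch 1 only, branch 2 only, both).
unravelled = x + W1 @ x + W2 @ x + W2 @ (W1 @ x)

assert np.allclose(stacked, unravelled)
```

With n such blocks the expansion has 2^n terms, most of which traverse only a few branches, which is the sense in which a residual network behaves like an ensemble of relatively shallow models.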
- Published
- 2019