1. DRCNN: Dynamic Routing Convolutional Neural Network for Multi-View 3D Object Recognition
- Author
- Kai Sun, Zengjie Song, Jiangshe Zhang, Ruixuan Yu, and Junmin Liu
- Subjects
Computer science, business.industry, Deep learning, Pooling, Cognitive neuroscience of visual object recognition, Pattern recognition, 02 engineering and technology, Computer Graphics and Computer-Aided Design, Convolutional neural network, 0202 electrical engineering, electronic engineering, information engineering, Feature (machine learning), Benchmark (computing), 020201 artificial intelligence & image processing, Affine transformation, Artificial intelligence, Routing (electronic design automation), business, Software
- Abstract
3D object recognition is one of the most important tasks in 3D data processing and has been studied extensively in recent years. Researchers have proposed various deep-learning-based 3D recognition methods, among which view-based approaches form a typical class. However, the view pooling layer that view-based methods commonly use to fuse multi-view features causes a loss of visual information. To alleviate this problem, we construct a novel layer called the Dynamic Routing Layer (DRL) by modifying the dynamic routing algorithm of the capsule network, so as to fuse the features of each view more effectively. Concretely, in DRL, we use rearrangement and affine transformation to convert the features, then leverage the modified dynamic routing algorithm to adaptively choose among the converted features, instead of ignoring all but the most active feature as the view pooling layer does. We also show that the view pooling layer is a special case of our DRL. In addition, based on DRL, we present a Dynamic Routing Convolutional Neural Network (DRCNN) for multi-view 3D object recognition. Experiments on three 3D benchmark datasets show that the proposed DRCNN outperforms many state-of-the-art methods, which demonstrates the efficacy of our method.
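The contrast between view pooling and routing-based fusion can be illustrated with a minimal sketch. It is not the paper's DRL: the abstract does not specify the modified routing algorithm, and the rearrangement and affine transformation steps are omitted here. The sketch assumes that view pooling is an element-wise maximum over per-view feature vectors, and it substitutes a simplified routing-by-agreement loop in which the coupling weights are normalized across views. The function names `view_pooling`, `squash`, and `routing_fuse` are hypothetical.

```python
import numpy as np

def view_pooling(features):
    """Element-wise max over views; features has shape (n_views, dim).

    For each dimension, only the most active view contributes.
    """
    return features.max(axis=0)

def squash(v, eps=1e-9):
    """Capsule-style squashing: keeps direction, maps the norm into [0, 1)."""
    sq_norm = np.sum(v * v, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * v / np.sqrt(sq_norm + eps)

def routing_fuse(features, n_iters=3):
    """Fuse per-view vectors with a simplified routing-by-agreement loop.

    Assumption: a single fused output, with coupling weights given by a
    softmax across views. This is an illustrative simplification, not the
    modified algorithm from the paper.
    """
    n_views = features.shape[0]
    logits = np.zeros(n_views)
    for _ in range(n_iters):
        # Softmax over views (shifted by the max for numerical stability).
        weights = np.exp(logits - logits.max())
        weights /= weights.sum()
        # Weighted sum of all views, then squashing.
        fused = squash(weights @ features)
        # Views that agree with the fused vector gain routing weight.
        logits = logits + features @ fused
    return fused, weights

# Toy input: two views that roughly agree and one that disagrees.
feats = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]])
pooled = view_pooling(feats)
fused, weights = routing_fuse(feats)
print("view pooling:", pooled)
print("routing weights:", weights)
```

In this sketch every view contributes to the fused vector in proportion to its routing weight, whereas the max in `view_pooling` keeps a single view's value per dimension.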
- Published
- 2021