1. Don't Forget The Past: Recurrent Depth Estimation from Monocular Video
- Author
-
Dengxin Dai, Wouter Van Gansbeke, Vaishakh Patil, and Luc Van Gool
- Subjects
FOS: Computer and information sciences ,Technology ,Computer Science - Machine Learning ,Control and Optimization ,Computer science ,Computer Vision and Pattern Recognition (cs.CV) ,Biomedical Engineering ,Computer Science - Computer Vision and Pattern Recognition ,Machine Learning (cs.LG) ,Image (mathematics) ,Computer Science - Robotics ,Artificial Intelligence ,autonomous vehicle navigation ,FOS: Electrical engineering, electronic engineering, information engineering ,Deep learning for visual perception ,RGBD perception ,Sensor fusion ,Novel deep learning methods ,Autonomous vehicle navigation ,Computer vision ,sensor fusion ,Estimation ,Science & Technology ,Monocular ,Series (mathematics) ,business.industry ,Mechanical Engineering ,Image and Video Processing (eess.IV) ,Robotics ,Electrical Engineering and Systems Science - Image and Video Processing ,Computer Science Applications ,Human-Computer Interaction ,novel deep learning methods ,Control and Systems Engineering ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Robotics (cs.RO) - Abstract
Autonomous cars need continuously updated depth information. Thus far, depth is mostly estimated independently for a single frame at a time, even if the method starts from video input. Our method produces a time series of depth maps, which makes it an ideal candidate for online learning approaches. In particular, we put three different types of depth estimation (supervised depth prediction, self-supervised depth prediction, and self-supervised depth completion) into a common framework. We integrate the corresponding networks with a ConvLSTM such that the spatiotemporal structures of depth across frames can be exploited to yield a more accurate depth estimation. Our method is flexible. It can be applied to monocular videos only or be combined with different types of sparse depth patterns. We carefully study the architecture of the recurrent network and its training strategy. We are first to successfully exploit recurrent networks for real-Time self-supervised monocular depth estimation and completion. Extensive experiments show that our recurrent method outperforms its image-based counterpart consistently and significantly in both self-supervised scenarios. It also outperforms previous depth estimation methods of the three popular groups. Please refer to our webpage for details. © 2016 IEEE., IEEE Robotics and Automation Letters, 5 (4), ISSN:2377-3766
- Published
- 2020