Joint self-supervised learning of interest point, descriptor, depth, and ego-motion from monocular video.

Authors :
Wang, Zhongyi
Shen, Mengjiao
Chen, Qijun
Source :
Multimedia Tools & Applications; Sep2024, Vol. 83 Issue 32, p77529-77547, 19p
Publication Year :
2024

Abstract

This paper addresses self-supervised learning of four low-level vision tasks that are critical to Visual Simultaneous Localization and Mapping (VSLAM): interest point learning, descriptor learning, ego-motion estimation, and depth estimation. Our key insight is that appearance and geometry constraints can be used to couple these fundamental vision problems. We propose a self-supervised framework that jointly trains neural networks for multiple objectives in order to address complicated issues, simplify systems, and provide important information for deep monocular VSLAM systems. First, we input two adjacent images into pose and depth networks to obtain their corresponding depth maps and camera poses. Then, we employ a differentiable geometry module that uses the depth maps and camera poses to generate the pseudo-input images needed by the interest point network and to construct the geometry loss. Next, we input the pseudo-input image and the source image into the interest point network to obtain the corresponding interest points, descriptors, and scores, from which we construct the appearance loss. Finally, we combine the geometry and appearance losses to constrain the whole network in a self-supervised manner. The novelty of this paper is that it integrates the key information required by monocular VSLAM into a unified framework that handles interest point learning, descriptor learning, ego-motion estimation, and depth estimation at the same time. Without any ground truth, our model learns these sub-problems jointly and achieves state-of-the-art performance in their respective domains. [ABSTRACT FROM AUTHOR]
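
The geometric step described in the abstract, using a depth map and a camera pose to relate pixels in two adjacent frames, can be illustrated with a minimal NumPy sketch. It is not the authors' differentiable geometry module, whose details the abstract does not give. It assumes a standard pinhole camera model, and the function name `reproject_pixels`, the intrinsics matrix `K`, and the 4x4 transform `T` are hypothetical names chosen for illustration.

```python
import numpy as np

def reproject_pixels(depth, K, T):
    """Map every target-frame pixel to its location in the source frame.

    Assumed inputs (illustrative, not taken from the paper):
      depth: (H, W) depth map of the target frame
      K:     (3, 3) pinhole camera intrinsics
      T:     (4, 4) rigid transform from target camera to source camera
    Returns an (H, W, 2) array of (u, v) coordinates in the source frame.
    """
    H, W = depth.shape
    # Homogeneous pixel grid (u, v, 1) for the target frame.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    # Back-project pixels to 3D points in the target camera frame.
    pts = (pix @ np.linalg.inv(K).T) * depth[..., None]
    # Move the points into the source camera frame.
    pts_src = pts @ T[:3, :3].T + T[:3, 3]
    # Project onto the source image plane and dehomogenize.
    proj = pts_src @ K.T
    return proj[..., :2] / proj[..., 2:3]
```

Sampling the source image at the returned coordinates (for example with bilinear interpolation) would give a synthesized view of the target frame, and the difference between that view and the real target image is one common way to build a self-supervised loss of this kind.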

Details

Language :
English
ISSN :
1380-7501
Volume :
83
Issue :
32
Database :
Complementary Index
Journal :
Multimedia Tools & Applications
Publication Type :
Academic Journal
Accession number :
179439256
Full Text :
https://doi.org/10.1007/s11042-024-18382-x