Repmono: a lightweight self-supervised monocular depth estimation architecture for high-speed inference

Authors :: Guowei Zhang
Xincheng Tang
Li Wang
Huankang Cui
Teng Fei
Hulin Tang
Shangfeng Jiang
Source :: Complex & Intelligent Systems, Vol 10, Iss 6, Pp 7927-7941 (2024)
Publication Year :: 2024
Publisher :: Springer, 2024.
Abstract: Self-supervised monocular depth estimation has always attracted attention because it does not require ground truth data. Designing a lightweight architecture capable of fast inference is crucial for deployment on mobile devices. The current network effectively integrates Convolutional Neural Networks (CNN) with Transformers, achieving significant improvements in accuracy. However, this advantage comes at the cost of an increase in model size and a significant reduction in inference speed. In this study, we propose a network named Repmono, which includes an LCKT module with a large convolutional kernel and a RepTM module based on the structural reparameterisation technique. With the combination of these two modules, our network achieves both local and global feature extraction with a smaller number of parameters and significantly enhances inference speed. Our network, with 2.31MB parameters, shows significant accuracy improvements over Monodepth2 in experiments on the KITTI dataset. With uniform input dimensions, our network’s inference speed is 53.7% faster than R-MSFM6, 60.1% faster than Monodepth2, and 81.1% faster than MonoVIT-small. Our code is available at https://github.com/txc320382/Repmono.
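
Note: the abstract names structural reparameterisation but does not describe the RepTM module itself. As a rough illustration of the general idea only (not the authors' implementation), the sketch below assumes a toy single-channel block with three parallel branches (a 3x3 convolution, a 1x1 convolution, and an identity shortcut) and shows that they can be folded into one 3x3 kernel producing the same output. The helper names `conv2d_same` and `merge_branches` and all numeric values are hypothetical.

```python
# Illustrative sketch only: generic branch merging, NOT the paper's RepTM module.

def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy, xx = y + i - pad, x + j - pad
                    if 0 <= yy < h and 0 <= xx < w:  # outside the image counts as zero
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def merge_branches(k3, k1):
    """Fold a 1x1 kernel and an identity shortcut into a copy of a 3x3 kernel."""
    merged = [row[:] for row in k3]
    # Both the 1x1 weight and the identity act only on the centre tap.
    merged[1][1] += k1[0][0] + 1.0
    return merged


# Hypothetical toy input and weights.
image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.1, 0.2, 0.3], [0.0, -0.5, 0.4], [0.25, 0.0, -0.1]]
k1 = [[0.75]]

# Training-time form: three parallel branches whose outputs are summed.
a = conv2d_same(image, k3)
b = conv2d_same(image, k1)
multi_branch = [[a[y][x] + b[y][x] + image[y][x] for x in range(3)] for y in range(3)]

# Inference-time form: one 3x3 convolution with the merged kernel.
single_branch = conv2d_same(image, merge_branches(k3, k1))

max_diff = max(
    abs(multi_branch[y][x] - single_branch[y][x]) for y in range(3) for x in range(3)
)
print("max difference between the two forms:", max_diff)
```

Because convolution is linear, the merged form computes the same function with a single branch, which is the usual reason this technique is associated with faster inference. Real networks would also need to fold in normalisation layers and handle multiple channels, which this sketch omits.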

Subjects :: Depth estimation
Large convolutional
Structural reparameterisation
Inference speed
Electronic computers. Computer science
QA75.5-76.95
Information technology
T58.5-58.64
