1. Temporo-Spatial Parallel Sparse Memory Networks for Efficient Video Object Segmentation
- Author
Dang, Jisheng; Zheng, Huicheng; Wang, Bimei; Wang, Longguang; Guo, Yulan
- Abstract
Memory-based networks have achieved tremendous success in video object segmentation. However, these methods still suffer from unfaithful segmentation and inferior efficiency in complicated video scenarios. The reasons are mainly threefold: 1) weak perception of fast-moving targets, because individual-frame memory patterns do not capture inter-frame motion; 2) poor discrimination between visually similar appearances, due to the limited receptive field; 3) redundant computation caused by matching against all memorized frames. To address these issues, we propose a Temporo-Spatial Parallel Sparse Memory network (TSPSM) for efficient video object segmentation. Our TSPSM constructs a temporal memory bank and a spatial memory bank in parallel to memorize complementary discriminative object cues. The temporal bank exploits discriminative temporal motion cues, while the spatial bank mines spatial context cues between adjacent frames with large receptive fields, thereby alleviating the ambiguity caused by similar instances and fast movements. To reduce redundant computation without sacrificing performance during the matching step, we further design a parallel sparse memory reader based on the constructed informative memory banks, which efficiently retrieves relevant temporal and spatial information in parallel. Experiments demonstrate that our TSPSM achieves state-of-the-art performance with real-time speed on the DAVIS and YouTube-VOS benchmarks. Furthermore, extensive experiments show that the proposed TSPMC module can be applied to existing methods as a generic plugin to significantly improve performance.
- Published
- 2024