1. Video object segmentation based on dynamic perception update and feature fusion.
- Author
- Hou, Zhiqiang; Li, Fucheng; Dong, Jiale; Dai, Nan; Ma, Sugang; and Fan, Jiulun
- Subjects
- *LAPLACIAN operator, *PROBLEM solving, *MEMORY, *ALGORITHMS, *VIDEOS
- Abstract
Currently popular memory-network-based video object segmentation algorithms update the memory pool with frame information indiscriminately and fail to make reasonable use of historical frame information. This causes redundant frame information in the memory pool and increases the amount of computation. At the same time, their mask refinement is relatively coarse, which blurs the edges of the generated mask. To solve these problems, this paper proposes a video object segmentation algorithm based on dynamic perception update and feature fusion. To make reasonable use of historical frame information, a dynamic perception update module is proposed to selectively update the segmentation frame mask. Meanwhile, a mask refinement module is established to enhance the detail information in the shallow features of the backbone network. This module uses a double-kernel fusion block to fuse feature information at different scales, and finally uses the Laplacian operator to sharpen the edges of the mask. Experimental results show that on the public datasets DAVIS2016, DAVIS2017 and YouTube-VOS 18, the comprehensive performance of the proposed algorithm reaches 86.9%, 79.3% and 71.6%, respectively, and the segmentation speed reaches 15 FPS on the DAVIS2016 dataset. Compared with many mainstream algorithms of recent years, it has clear performance advantages.
• The idea of key frames is introduced into video object segmentation, and a dynamic perception update module is proposed. By calculating the change in IoU between consecutive frames, the module determines whether to write the current frame's information to the memory pool, thereby reducing redundant frame information in the memory pool.
• To make the final mask more refined, this paper proposes a mask refinement module that uses channel attention to adaptively weight the target channels of shallow features, a double-kernel fusion block to expand the receptive field of the fused features, and edge detection operators to enhance the edge information of the features.
• Compared with current state-of-the-art video object segmentation methods, the proposed algorithm shows clear advantages on multiple public datasets.
[ABSTRACT FROM AUTHOR]
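The key-frame idea in the first highlight can be illustrated with a minimal sketch. The abstract does not give the exact update criterion, so the rule below (store a frame when the IoU between its mask and the previous frame's mask drops below a threshold) is an assumption, and the names `mask_iou`, `select_memory_frames` and `iou_threshold` are hypothetical, not taken from the paper.

```python
from typing import List, Sequence

Mask = Sequence[Sequence[int]]  # binary mask: rows of 0/1 values


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    # Two empty masks are treated as identical.
    return inter / union if union else 1.0


def select_memory_frames(masks: List[Mask], iou_threshold: float = 0.8) -> List[int]:
    """Return indices of frames that would be written to the memory pool.

    Assumed rule (not taken from the paper): frame t is stored when the IoU
    between its mask and the mask of frame t-1 falls below `iou_threshold`,
    i.e. when the object changed noticeably. The first frame is always stored.
    """
    if not masks:
        return []
    selected = [0]
    for t in range(1, len(masks)):
        if mask_iou(masks[t - 1], masks[t]) < iou_threshold:
            selected.append(t)
    return selected


if __name__ == "__main__":
    # Four toy 2x2 masks: frames 0-1 are identical, frame 2 changes, frame 3 repeats it.
    toy_masks = [
        [[1, 1], [0, 0]],
        [[1, 1], [0, 0]],
        [[0, 1], [0, 1]],
        [[0, 1], [0, 1]],
    ]
    print(select_memory_frames(toy_masks))  # prints [0, 2]
```

In this sketch, frames whose masks barely differ from the previous frame are skipped, which is one plausible way to keep redundant frames out of the memory pool. The threshold value of 0.8 is arbitrary and chosen only for illustration.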
- Published
- 2024