
Portrait Semantic Segmentation Method Based on Dual Modal Information Complementarity

Authors :
Guang Feng
Chong Tang
Source :
Applied Sciences, Vol 14, Iss 4, p 1439 (2024)
Publication Year :
2024
Publisher :
MDPI AG, 2024.

Abstract

Semantic segmentation of human images is a research hotspot in the field of computer vision. At present, semantic segmentation models based on U-Net generally lack the ability to capture the spatial information of images, and semantic incompatibility arises because the feature maps of the encoder and decoder are connected directly at the skip connection stage. In addition, in low-light scenes such as night-time, false segmentation occurs easily and segmentation accuracy drops. To solve these problems, a portrait semantic segmentation method based on dual-modal information complementarity is proposed. The encoder adopts a dual-branch structure and uses an SK-ASSP module, which adaptively adjusts the convolution weights of different receptive fields, to extract features from the RGB and grayscale modalities respectively, and then performs cross-modal information complementation and feature fusion. A hybrid attention mechanism is used at the skip connection stage to capture both the channel and coordinate context information of the image. Experiments on a human matting dataset show that the PA and MIoU of the proposed model reach 96.58% and 94.48% respectively, outperforming the U-Net baseline and other mainstream semantic segmentation models.
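The abstract does not give implementation details; as an illustration only, the two mechanisms it names — selective weighting of branches with different receptive fields, and attention over both channel and coordinate context — can be sketched in simplified NumPy form. The function names `sk_fuse` and `coordinate_attention` are hypothetical, and the learned convolutions of the real SK-ASSP module and hybrid attention mechanism are omitted here:

```python
import numpy as np

def _softmax(x, axis=0):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sk_fuse(branches):
    """Selective-kernel-style fusion (hypothetical simplification):
    weight the feature maps produced by convolutions with different
    receptive fields using softmax attention computed from their
    global descriptors, then sum them.
    branches: list of (C, H, W) arrays."""
    stacked = np.stack(branches)            # (B, C, H, W)
    desc = stacked.mean(axis=(2, 3))        # global average pool -> (B, C)
    weights = _softmax(desc, axis=0)        # per-channel branch weights
    return (weights[..., None, None] * stacked).sum(axis=0)  # (C, H, W)

def coordinate_attention(x):
    """Simplified channel + coordinate attention gate: pool along each
    spatial axis, squash the descriptors to (0, 1) gates with a sigmoid,
    and modulate the input feature map with both gates."""
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    a_h = sig(x.mean(axis=2))[:, :, None]   # (C, H, 1): pooled along width
    a_w = sig(x.mean(axis=1))[:, None, :]   # (C, 1, W): pooled along height
    return x * a_h * a_w
```

Because the softmax weights sum to one per channel across branches, feeding `sk_fuse` identical branch inputs reduces it to the identity, which gives a simple sanity check on the fusion step.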

Details

Language :
English
ISSN :
2076-3417
Volume :
14
Issue :
4
Database :
Directory of Open Access Journals
Journal :
Applied Sciences
Publication Type :
Academic Journal
Accession number :
edsdoj.3381c280fe9a471883c641fc3175f823
Document Type :
article
Full Text :
https://doi.org/10.3390/app14041439