
Fusion-competition framework of local topology and global texture for head pose estimation.

Authors :
Ma, Dongsheng
Fu, Tianyu
Yang, Yifei
Cao, Kaibin
Fan, Jingfan
Xiao, Deqiang
Song, Hong
Gu, Ying
Yang, Jian
Source :
Pattern Recognition. May 2024, Vol. 149.
Publication Year :
2024

Abstract

• The proposed method combines heterogeneous data to fully utilize the texture information of the RGB image and the geometric information of the point cloud. Compared with a depth image, a point cloud carries more powerful topological features, which can be learned jointly with texture features for accurate and robust head pose estimation.

• The proposed framework achieves feature fusion at the texture-topology level and generates feature competition among the local regions. This fusion-competition framework enhances the expression of features of different categories at different levels, decreasing the estimation error and increasing stability.

• This paper constructs an RGB-Depth dataset using HoloLens2 for training and testing in head pose estimation. The dataset contains abundant head pose samples: 24 sessions with 12 K frames from 21 males and 1 female, with the ground-truth pose of each frame labeled by an accurate tracking device attached to the head.

RGB images and point clouds carry texture and geometric structure, respectively, and are widely used for head pose estimation. However, images lack spatial information, and the quality of a point cloud is easily affected by sensor noise. In this paper, a novel fusion-competition framework (FCF) is proposed to overcome the limitations of a single modality. Global texture information is extracted from the image and local topology information is extracted from the point cloud, projecting the heterogeneous data into a common feature subspace. The projected texture feature, weighted by a channel attention mechanism, is embedded into each local point cloud region with different topological features for fusion. A scoring mechanism creates competition among the regions holding local-global fused features, and the final pose is predicted by the region with the highest score.
According to evaluation results on the public datasets and our constructed dataset, the FCF improves estimation accuracy and stability by an average of 13.6 % and 12.7 %, respectively, compared with nine state-of-the-art methods. [ABSTRACT FROM AUTHOR]
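The fusion-and-competition idea described in the abstract can be sketched in a few lines. The sketch below is a minimal illustration, not the authors' implementation: all dimensions, the sigmoid gating used for channel attention, and the linear score/pose heads are assumptions chosen for clarity. It shows the three stages the abstract names: gating the global texture feature per channel, embedding it into every local point cloud region, and letting the region with the highest score supply the final pose.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not taken from the paper)
C = 16   # feature channels per modality
R = 8    # number of local point cloud regions

# Global texture feature from the RGB branch (one vector per frame)
texture = rng.normal(size=C)

# Local topology features, one per point cloud region
topology = rng.normal(size=(R, C))

# Channel attention: sigmoid-gated per-channel weights on the texture feature
# (stand-in for a learned gating layer)
attn_logits = rng.normal(size=C)
attn = 1.0 / (1.0 + np.exp(-attn_logits))   # sigmoid, values in (0, 1)
weighted_texture = attn * texture

# Fusion: embed the weighted global texture into every local region
fused = np.concatenate(
    [topology, np.tile(weighted_texture, (R, 1))], axis=1
)  # shape (R, 2*C)

# Competition: each region predicts a confidence score and a pose;
# the region with the highest score wins (stand-in linear heads)
score_w = rng.normal(size=2 * C)
pose_w = rng.normal(size=(2 * C, 3))        # e.g. yaw, pitch, roll

scores = fused @ score_w                    # shape (R,)
poses = fused @ pose_w                      # shape (R, 3)

best = int(np.argmax(scores))
final_pose = poses[best]
print(fused.shape, best, final_pose.shape)
```

In a trained network the gating layer and the score/pose heads would be learned end to end; the scoring step turns pose prediction into a selection among regions, which is what lets noisy local regions be out-competed rather than averaged in.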

Details

Language :
English
ISSN :
00313203
Volume :
149
Database :
Academic Search Index
Journal :
Pattern Recognition
Publication Type :
Academic Journal
Accession number :
175681424
Full Text :
https://doi.org/10.1016/j.patcog.2024.110285