Back to Search Start Over

Facial Expression Recognition Based on Vision Transformer with Hybrid Local Attention.

Authors :
Tian, Yuan
Zhu, Jingxuan
Yao, Huang
Chen, Di
Source :
Applied Sciences (2076-3417); Aug2024, Vol. 14 Issue 15, p6471, 15p
Publication Year :
2024

Abstract

Facial expression recognition has wide application prospects in many occasions. Due to the complexity and variability of facial expressions, facial expression recognition has become a very challenging research topic. This paper proposes a Vision Transformer expression recognition method based on hybrid local attention (HLA-ViT). The network adopts a dual-stream structure. One stream extracts the hybrid local features and the other stream extracts the global contextual features. These two streams constitute a global–local fusion attention. The hybrid local attention module is proposed to enhance the network's robustness to face occlusion and head pose variations. The convolutional neural network is combined with the hybrid local attention module to obtain feature maps with local prominent information. Robust features are then captured by the ViT from the global perspective of the visual sequence context. Finally, the decision-level fusion mechanism fuses the expression features with local prominent information, adding complementary information to enhance the network's recognition performance and robustness against interference factors such as occlusion and head posture changes in natural scenes. Extensive experiments demonstrate that our HLA-ViT network achieves an excellent performance with 90.45% on RAF-DB, 90.13% on FERPlus, and 65.07% on AffectNet. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
20763417
Volume :
14
Issue :
15
Database :
Complementary Index
Journal :
Applied Sciences (2076-3417)
Publication Type :
Academic Journal
Accession number :
178949444
Full Text :
https://doi.org/10.3390/app14156471