
Continuous sign language recognition enhanced by dynamic attention and maximum backtracking probability decoding.

Authors :
Xiong, Sije
Zou, Chunlong
Yun, Juntong
Jiang, Du
Huang, Li
Liu, Ying
Xie, Yuanmin
Source :
Signal, Image & Video Processing; Jan 2025, Vol. 19 Issue 1, p1-13, 13p
Publication Year :
2025

Abstract

Sign language uses changes in hand shape, body movements, and facial expressions together to convey information. Most current continuous sign language recognition (CSLR) models focus on extracting information from each video frame and neglect the dynamically changing characteristics of the signer across multiple frames. This is at odds with the essence of CSLR, which aims to learn the most essential features of changes in both hand-controlled and non-hand-controlled parts and convert them into text. In this paper, a feature alignment method is first employed to explicitly capture the spatial position offset and motion direction information between neighboring frames, directing a dynamic attention mechanism to focus on regions of subtle change. A dynamic decoding module based on maximum backtracking probability is then proposed to decode word-level features and enforce word consistency constraints without increasing computational resources. We propose a comprehensive CSLR model (DAM-MCD) that combines a Dynamic Attention Mechanism and Maximum Backtracking Probability Dynamic Decoding, enhancing the model's inference capability and robustness. Experiments conducted on two publicly accessible datasets, RWTH and RWTH-T, demonstrate that the DAM-MCD model achieves higher accuracy than methods using multi-cue input. The results further show that our model effectively captures sign language motion information in videos. [ABSTRACT FROM AUTHOR]
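
Illustrative note (not part of the published abstract): the idea of steering attention toward regions that change between neighboring frames can be sketched in a few lines. The sketch below is a toy stand-in under stated assumptions, not the DAM-MCD method. The function name `motion_attention`, the `temperature` parameter, and the use of a softmax over absolute pixel differences are all hypothetical choices made for illustration; the abstract does not specify how the feature alignment or the attention weights are computed.

```python
import math


def motion_attention(prev_frame, next_frame, temperature=1.0):
    """Return a spatial attention map (same shape as the frames) that sums to 1.

    Positions where the two neighboring frames differ most receive the
    largest weight. Softmax over absolute frame differences is an
    illustrative assumption, not the paper's formulation.
    """
    diffs = [
        [abs(b - a) / temperature for a, b in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_frame, next_frame)
    ]
    peak = max(max(row) for row in diffs)  # subtract the max for numerical stability
    exps = [[math.exp(d - peak) for d in row] for row in diffs]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


# Toy 3x3 grayscale frames: only the center pixel changes between them.
frame_t = [[0.0, 0.0, 0.0], [0.0, 0.2, 0.0], [0.0, 0.0, 0.0]]
frame_t1 = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

attention = motion_attention(frame_t, frame_t1, temperature=0.1)
```

In this toy case nearly all of the attention mass lands on the single pixel that changed, which mirrors the abstract's goal of focusing on regions of subtle change rather than treating each frame in isolation.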

Details

Language :
English
ISSN :
1863-1703
Volume :
19
Issue :
1
Database :
Complementary Index
Journal :
Signal, Image & Video Processing
Publication Type :
Academic Journal
Accession number :
181717868
Full Text :
https://doi.org/10.1007/s11760-024-03718-9