
A robust and efficient method for skeleton‐based human action recognition and its application for cross‐dataset evaluation.

Authors :
Nguyen, Tien‐Thanh
Pham, Dinh‐Tan
Vu, Hai
Le, Thi‐Lan
Source :
IET Computer Vision (Wiley-Blackwell); Dec2022, Vol. 16 Issue 8, p709-726, 18p
Publication Year :
2022

Abstract

Skeleton-based human action recognition has emerged recently thanks to its compactness and robustness to appearance variations. Although impressive results have been obtained in recent years, the performance of skeleton-based action recognition methods must still improve before deployment in real-time applications. Recently, a lightweight network named Double-feature Double-motion Network (DD-Net) was proposed for skeleton-based human action recognition. With high speed, DD-Net achieves state-of-the-art performance on hand and body actions. DD-Net is well suited to actions that correlate strongly with global trajectories; however, it cannot distinguish actions that are only weakly connected to those trajectories. In this paper, the authors propose TD-Net, an improved version of DD-Net in which a new branch is added. The new branch takes the normalised coordinates of joints (NCJ) to enrich the spatial information. On five datasets for skeleton-based human activity recognition, namely MSR-Action3D, CMDFall, JHMDB, FPHAB, and NTU RGB+D, TD-Net consistently obtains superior performance compared with the baseline DD-Net. The proposed method also outperforms various state-of-the-art methods, both hand-designed and deep learning-based, on four datasets (MSR-Action3D, CMDFall, JHMDB, and FPHAB). Furthermore, the generalisation of the proposed method is confirmed through cross-dataset evaluation. To illustrate the model's potential for real-time human action recognition, the authors deployed an application on an edge device. Experiments show that the application processes up to 40 fps for pose estimation using MediaPipe, and recognising an action from a skeleton sequence takes only 0.04 ms. [ABSTRACT FROM AUTHOR]
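As an illustration of the kind of joint normalisation the abstract refers to (this is a common generic scheme, not necessarily the paper's exact NCJ formulation), skeleton coordinates are often made translation- and scale-invariant by centring each frame on a root joint and dividing by a reference bone length. A minimal sketch, with the joint indices chosen arbitrarily for the example:

```python
import numpy as np

def normalise_joints(skeleton, root=0, ref_pair=(0, 1)):
    """Centre joints on a root joint and scale by a reference bone length.

    skeleton: array of shape (frames, joints, 3).
    Illustrative normalisation only; the paper's NCJ may differ in detail.
    """
    # Translate every frame so the root joint sits at the origin.
    centred = skeleton - skeleton[:, root:root + 1, :]
    # Per-frame length of a reference bone (e.g. root-to-neck) as the scale.
    a, b = ref_pair
    bone = np.linalg.norm(skeleton[:, a, :] - skeleton[:, b, :], axis=-1)
    scale = np.maximum(bone, 1e-8)[:, None, None]  # guard against zero length
    return centred / scale

# Example: a random 10-frame, 20-joint 3D skeleton sequence.
seq = np.random.rand(10, 20, 3)
ncj = normalise_joints(seq)
```

After normalisation, the root joint is at the origin in every frame and the reference bone has unit length, so the features are invariant to where the subject stands and how large they appear.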

Details

Language :
English
ISSN :
1751-9632
Volume :
16
Issue :
8
Database :
Complementary Index
Journal :
IET Computer Vision (Wiley-Blackwell)
Publication Type :
Academic Journal
Accession number :
159688928
Full Text :
https://doi.org/10.1049/cvi2.12119