Human–robot interaction-oriented video understanding of human actions.
- Source :
- Engineering Applications of Artificial Intelligence, Jul 2024, Part A, Vol. 133. 1p.
- Publication Year :
- 2024
Abstract
- This paper focuses on action recognition tasks oriented to the field of human–robot interaction, which is one of the major challenges in the robotic video understanding field. Previous approaches focus on designing temporal models but lack the ability to capture motion information and to build contextual correlation models. This may result in robots being unable to effectively understand long-term video actions. To solve these two problems, this paper proposes a novel video understanding framework comprising an Adaptive Temporal Sensitivity and Motion Capture Network (ATSMC-Net) and a contextual scene reasoning module called the Knowledge Function Graph Module (KFG-Module). The proposed ATSMC-Net can adaptively adjust the frame-level and pixel-level sensitive regions of temporal features to effectively capture motion information. To fuse contextual scene information for cross-temporal inference, the KFG-Module is introduced to achieve fine-grained video understanding based on the relationship between objects and actions. We evaluate the method on three public video understanding benchmarks: Something-Something-V1&V2 and HMDB51. In addition, we present a dataset with real-world application scenarios of human–robot interaction to verify the effectiveness of our approach on mobile robots. The experimental results show that the proposed method can significantly improve the video understanding of robots. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 0952-1976
- Volume :
- 133
- Database :
- Academic Search Index
- Journal :
- Engineering Applications of Artificial Intelligence
- Publication Type :
- Academic Journal
- Accession number :
- 177605573
- Full Text :
- https://doi.org/10.1016/j.engappai.2024.108247