Efficient data-driven behavior identification based on vision transformers for human activity understanding.

Authors :
Yang, Jiachen
Zhang, Zhuo
Xiao, Shuai
Ma, Shukun
Li, Yang
Lu, Wen
Gao, Xinbo
Source :
Neurocomputing. Apr 2023, Vol. 530, pp. 104-115. 12 p.
Publication Year :
2023

Abstract

• We focus on the data dilemma encountered in human activity understanding, approach practical application problems from a new perspective, and use the proposed method to reduce the model's dependence on data.
• We construct Human SA-10, a human physical activity dataset with 10 categories, for use in human activity understanding research.
• We propose Core-Weight Entropy, a data information evaluation method applicable to human behavior recognition tasks. On Human SA-10, our method can reduce data usage by 50%, and it achieves state-of-the-art performance compared with other methods.
• We also propose a new module for removing redundant information from unlabeled data, which effectively avoids introducing similar data into the training set.

Advances in computer vision have greatly promoted research on human activity understanding. Recognition algorithms based on vision transformers have made progress on many computer vision tasks, but they still need to be driven by large amounts of data. Removing this constraint is crucial for human behavior recognition based on vision transformers. This paper focuses on the big-data dilemma and tries to achieve a high-performance model with a small amount of high-information human activity data. By studying the feature distribution, we propose a core weight entropy data information evaluation method for obtaining high-information data, together with a redundant information elimination strategy that avoids introducing similar data. Extensive experimental results show the effectiveness of the proposed method. Compared with existing methods, our method reduces data consumption by 5% to 30%, and it can reach the performance obtained with 100% of the data while using only 50% of it. More importantly, the data selected by our method contains no redundancy, a property that other methods do not offer. In addition, we carried out extensive ablation experiments to justify the design of the method. This work addresses the challenge of relying on large amounts of data when using vision transformers to recognize human behavior, which is of practical significance for efficient human activity understanding research with little data. [ABSTRACT FROM AUTHOR]
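The abstract names two ingredients: scoring unlabeled samples by how informative they are, and dropping candidates that are too similar to data already chosen. It does not give the Core-Weight Entropy formula or the details of the redundancy removal module, so the sketch below is only a generic stand-in and not the authors' method. It assumes plain Shannon entropy of a model's predicted class probabilities as the information score and a cosine-similarity threshold on feature vectors as the redundancy test. The names `prediction_entropy`, `cosine_similarity`, `select_informative`, and `sim_threshold` are hypothetical.

```python
import math


def prediction_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_informative(probs, features, budget, sim_threshold=0.95):
    """Pick up to `budget` sample indices: highest entropy first,
    skipping any sample whose features are too close to one already picked.

    Assumption: entropy stands in for the paper's information score and
    a similarity threshold stands in for its redundancy removal module.
    """
    order = sorted(
        range(len(probs)),
        key=lambda i: prediction_entropy(probs[i]),
        reverse=True,
    )
    selected = []
    for i in order:
        if len(selected) >= budget:
            break
        is_redundant = any(
            cosine_similarity(features[i], features[j]) >= sim_threshold
            for j in selected
        )
        if not is_redundant:
            selected.append(i)
    return selected


if __name__ == "__main__":
    # Toy data: sample 1 is nearly a copy of sample 0 in feature space.
    probs = [[0.50, 0.50], [0.55, 0.45], [0.90, 0.10], [0.99, 0.01]]
    features = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [1.0, 1.0]]
    print(select_informative(probs, features, budget=2))
```

On the toy data, sample 1 has the second-highest entropy but is skipped because it is nearly identical to sample 0, so the selection moves on to sample 2. A greedy pass like this is quadratic in the number of selected samples, which is acceptable for a sketch but would need an approximate nearest-neighbor index on a real video dataset.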

Details

Language :
English
ISSN :
0925-2312
Volume :
530
Database :
Academic Search Index
Journal :
Neurocomputing
Publication Type :
Academic Journal
Accession number :
162130839
Full Text :
https://doi.org/10.1016/j.neucom.2023.01.067