A survey on deep learning-based spatio-temporal action detection.
- Source :
- International Journal of Wavelets, Multiresolution & Information Processing. Jul2024, Vol. 22 Issue 4, p1-35. 35p.
- Publication Year :
- 2024
Abstract
- Spatio-temporal action detection (STAD) aims to classify the actions present in a video and localize them in space and time. It has become a particularly active area of research in computer vision because of its rapidly emerging real-world applications, such as autonomous driving, visual surveillance and entertainment. Many efforts have been devoted in recent years to building a robust and effective framework for STAD. This paper provides a comprehensive review of the state-of-the-art deep learning-based methods for STAD. First, a taxonomy is developed to organize these methods. Next, the linking algorithms, which aim to associate frame- or clip-level detection results to form action tubes, are reviewed. Then, the commonly used benchmark datasets and evaluation metrics are introduced, and the performance of state-of-the-art models is compared. Finally, the paper concludes with a discussion of potential research directions for STAD. [ABSTRACT FROM AUTHOR]
- Subjects :
- *COMPUTER vision
- *DEEP learning
- *AUTONOMOUS vehicles
Details
- Language :
- English
- ISSN :
- 02196913
- Volume :
- 22
- Issue :
- 4
- Database :
- Academic Search Index
- Journal :
- International Journal of Wavelets, Multiresolution & Information Processing
- Publication Type :
- Academic Journal
- Accession number :
- 178557967
- Full Text :
- https://doi.org/10.1142/S0219691323500662