Learning a strong detector for action localization in videos
- Source :
- Pattern Recognition Letters. 128:407-413
- Publication Year :
- 2019
- Publisher :
- Elsevier BV, 2019.
Abstract
- We address the problem of spatio-temporal action localization in videos. Current state-of-the-art methods for this challenging task rely on an object detector to first localize actors at the frame level, and then link or track the detections across time. Most of these methods focus on leveraging the temporal context of videos for action detection while ignoring the importance of the object detector itself. In this paper, we show the importance of the object detector in the action localization pipeline and propose a strong object detector, based on the single shot multibox detector (SSD) framework, for better action localization in videos. Unlike SSD, we introduce an anchor refine branch at the end of the backbone network to refine the input anchors, and add a batch normalization layer before concatenating the intermediate feature maps at the frame level and after stacking feature maps at the clip level. The proposed strong detector makes two contributions: (1) reducing missed target objects at the frame level; (2) generating deformable anchor cuboids for modeling temporally dynamic actions. Extensive experiments on UCF-Sports, J-HMDB and UCF-101 validate our claims, and we outperform the previous state-of-the-art methods by a large margin in terms of frame-mAP and video-mAP, especially at higher overlap thresholds.
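The abstract reports video-mAP at an overlap threshold but this record does not say how the overlap is computed. The sketch below assumes one common convention for action tubes: temporal IoU over frame indices multiplied by the mean spatial IoU over the shared frames. The names `box_iou` and `tube_iou`, and the dictionary representation of a tube, are illustrative choices and are not taken from the paper.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def tube_iou(tube_a, tube_b):
    """Spatio-temporal IoU of two tubes, each a dict {frame_index: box}.

    Assumed convention: temporal IoU of the frame sets times the
    mean per-frame box IoU over the frames both tubes cover.
    """
    shared = set(tube_a) & set(tube_b)
    if not shared:
        return 0.0
    all_frames = set(tube_a) | set(tube_b)
    temporal_iou = len(shared) / len(all_frames)
    spatial_iou = sum(box_iou(tube_a[f], tube_b[f]) for f in shared) / len(shared)
    return temporal_iou * spatial_iou


# Hypothetical tubes: same box, shifted by one frame in time.
box = (0.0, 0.0, 10.0, 10.0)
detected = {0: box, 1: box}
ground_truth = {1: box, 2: box}
overlap = tube_iou(detected, ground_truth)
is_match = overlap >= 0.5  # a detection counts as correct only above the threshold
```

Under this convention a tube must be well aligned both spatially and temporally to pass a high threshold, which is why results at higher thresholds are a stricter test of localization quality.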
- Subjects :
- Normalization (statistics)
business.industry
Computer science
Pipeline (computing)
Detector
ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION
Normalization (image processing)
02 engineering and technology
01 natural sciences
Task (computing)
Action (philosophy)
Artificial Intelligence
Margin (machine learning)
Feature (computer vision)
0103 physical sciences
Signal Processing
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Computer vision
Computer Vision and Pattern Recognition
Artificial intelligence
010306 general physics
business
Software
Details
- ISSN :
- 0167-8655
- Volume :
- 128
- Database :
- OpenAIRE
- Journal :
- Pattern Recognition Letters
- Accession number :
- edsair.doi...........3812c67c9dad29f9dff8b794143dcaec