Purely Attention Based Local Feature Integration for Video Classification.

Authors :
Long, Xiang
de Melo, Gerard
He, Dongliang
Li, Fu
Chi, Zhizhen
Wen, Shilei
Gan, Chuang
Source :
IEEE Transactions on Pattern Analysis & Machine Intelligence; Apr 2022, Vol. 44, Issue 4, pp. 2140-2154 (15 pages)
Publication Year :
2022

Abstract

Recently, substantial research effort has focused on how to apply CNNs or RNNs to better capture temporal patterns in videos, so as to improve the accuracy of video classification. In this paper, we investigate the potential of purely attention-based local feature integration. Accounting for the characteristics of such features in video classification, we first propose Basic Attention Clusters (BAC), which concatenate the outputs of multiple attention units applied in parallel, and introduce a shifting operation to capture more diverse signals. Experiments show that BAC can achieve excellent results on multiple datasets. However, BAC treats all feature channels as an indivisible whole, which is suboptimal for achieving a finer-grained local feature integration over the channel dimension. Additionally, it treats the entire local feature sequence as an unordered set, thus ignoring the sequential relationships. To improve over BAC, we further propose the channel pyramid attention schema, which splits features into sub-features at multiple scales for coarse-to-fine sub-feature interaction modeling, and the temporal pyramid attention schema, which divides the feature sequences into ordered sub-sequences of multiple lengths to account for the sequential order. Our final model, pyramid×pyramid attention clusters (PPAC), combines both channel pyramid attention and temporal pyramid attention to focus on the most important sub-features while also preserving the temporal information of the video. We demonstrate the effectiveness of PPAC on seven real-world video classification datasets. Our model achieves competitive results across all of these, showing that our proposed framework can consistently outperform the existing local feature integration methods across a range of different scenarios. [ABSTRACT FROM AUTHOR]
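
The abstract gives no formulas, so the following is only a minimal sketch of the ideas it names, not the authors' implementation. It assumes a single linear scoring vector per attention unit, a scalar shift (`alpha`, `beta`) followed by L2 normalisation as the "shifting operation", and an even split of the sequence into ordered segments for the temporal pyramid. All function and parameter names are hypothetical. The channel pyramid is not shown.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_unit(features, w, alpha=1.0, beta=0.0):
    """One attention unit: attention-weighted sum of local features, then a shift.

    features: list of T local feature vectors, each of length D.
    w: scoring vector of length D (assumed form of the attention scorer).
    alpha, beta: scalar shift parameters (assumed form of the shifting operation).
    """
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(a * x[d] for a, x in zip(weights, features)) for d in range(dim)]
    shifted = [alpha * p + beta for p in pooled]
    norm = math.sqrt(sum(s * s for s in shifted)) or 1.0  # avoid division by zero
    return [s / norm for s in shifted]


def attention_cluster(features, units):
    """Concatenate the outputs of several attention units applied in parallel."""
    out = []
    for w, alpha, beta in units:
        out.extend(attention_unit(features, w, alpha, beta))
    return out


def temporal_pyramid(features, units, levels=(1, 2)):
    """Apply the cluster to ordered sub-sequences at several granularities.

    Each level splits the sequence into `parts` contiguous segments
    (requires len(features) >= parts) and concatenates the results in order.
    """
    total = len(features)
    out = []
    for parts in levels:
        bounds = [i * total // parts for i in range(parts + 1)]
        for start, end in zip(bounds, bounds[1:]):
            out.extend(attention_cluster(features[start:end], units))
    return out


# Toy usage: 6 frames with 4-dimensional features, 2 attention units.
random.seed(0)
T, D = 6, 4
features = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]
units = [([random.gauss(0, 1) for _ in range(D)], 1.0, 0.1) for _ in range(2)]
video_vector = temporal_pyramid(features, units)
print(len(video_vector))  # 2 units * 4 dims * (1 + 2) segments = 24
```

In this sketch a single cluster is order-invariant, since a softmax-weighted sum does not depend on frame order. The pyramid recovers coarse ordering only because segment outputs are concatenated in sequence order.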

Details

Language :
English
ISSN :
0162-8828
Volume :
44
Issue :
4
Database :
Complementary Index
Journal :
IEEE Transactions on Pattern Analysis & Machine Intelligence
Publication Type :
Academic Journal
Accession number :
155735829
Full Text :
https://doi.org/10.1109/TPAMI.2020.3029554