First- And Third-Person Video Co-Analysis By Learning Spatial-Temporal Joint Attention
- Source :
- IEEE Transactions on Pattern Analysis and Machine Intelligence. 45:6631-6646
- Publication Year :
- 2023
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2023.
Abstract
- Recent years have witnessed a tremendous increase in first-person videos captured by wearable devices. Such videos record information from perspectives different from the traditional third-person view, and thus show a wide range of potential uses. However, techniques for analyzing videos from different views can be fundamentally different, not to mention co-analyzing both views to explore the shared information. In this paper, we take on the challenge of cross-view video co-analysis and deliver a novel learning-based method. At the core of our method is the notion of "joint attention", indicating the shared attention regions that link the corresponding views and eventually guide the shared representation learning across views. To this end, we propose a multi-branch deep network, which extracts cross-view joint attention and shared representation from static frames with spatial constraints, in a self-supervised and simultaneous manner. In addition, by incorporating the temporal transition model of the joint attention, we obtain spatial-temporal joint attention that can robustly capture the essential information extending through time. Our method outperforms the state-of-the-art on the standard cross-view video matching tasks on public datasets. Furthermore, we demonstrate how the learnt joint information can benefit various applications through a set of qualitative and quantitative experiments.
- Subjects :
- Matching (statistics)
Joint attention
Computer science
business.industry
Applied Mathematics
Transition (fiction)
Computational Theory and Mathematics
Artificial Intelligence
Human–computer interaction
Computer Vision and Pattern Recognition
Artificial intelligence
Set (psychology)
Representation (mathematics)
Joint (audio engineering)
business
Feature learning
Software
Wearable technology
Details
- ISSN :
- 1939-3539 and 0162-8828
- Volume :
- 45
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- Accession number :
- edsair.doi.dedup.....3507920f31a2ef2820cdcbb2cfb45c09
- Full Text :
- https://doi.org/10.1109/tpami.2020.3030048