Local Correspondence Network for Weakly Supervised Temporal Sentence Grounding.
- Source :
- IEEE Transactions on Image Processing. 2021, Vol. 30, p3252-3262. 11p.
- Publication Year :
- 2021
Abstract
- Weakly supervised temporal sentence grounding has better scalability and practicability than fully supervised methods in real-world application scenarios. However, most existing methods cannot model the fine-grained video-text local correspondences well and lack effective supervision for correspondence learning, and thus yield unsatisfactory performance. To address these issues, we propose an end-to-end Local Correspondence Network (LCNet) for weakly supervised temporal sentence grounding. The proposed LCNet has several merits. First, we represent video and text features in a hierarchical manner to model the fine-grained video-text correspondences. Second, we design a self-supervised cycle-consistent loss as learning guidance for video and text matching. To the best of our knowledge, this is the first work to fully explore the fine-grained correspondences between video and text for temporal sentence grounding by using self-supervised learning. Extensive experimental results on two benchmark datasets demonstrate that the proposed LCNet significantly outperforms existing weakly supervised methods. [ABSTRACT FROM AUTHOR]
- Subjects :
- *SUPERVISED learning
- *FEATURE extraction
- *TASK analysis
Details
- Language :
- English
- ISSN :
- 1057-7149
- Volume :
- 30
- Database :
- Academic Search Index
- Journal :
- IEEE Transactions on Image Processing
- Publication Type :
- Academic Journal
- Accession number :
- 170077684
- Full Text :
- https://doi.org/10.1109/TIP.2021.3058614