Local-Global Graph Pooling via Mutual Information Maximization for Video-Paragraph Retrieval.

Authors :: Zhang, Pengcheng
Zhao, Zhou
Wang, Nannan
Yu, Jun
Wu, Fei
Source :: IEEE Transactions on Circuits & Systems for Video Technology. Oct2022, Vol. 32 Issue 10, p7133-7146. 14p.
Publication Year :: 2022
Abstract: As a task of cross-modal retrieval between long videos and paragraphs, video-paragraph retrieval is a non-trivial task. Unlike traditional video-text retrieval, the video in video-paragraph retrieval usually contains multiple clips. Each clip corresponds to a descriptive sentence; all the sentences constitute the corresponding paragraph of the video. Previous methods for video-paragraph retrieval usually encode videos and paragraphs from segment-level (clips and sentences) and overall-level (videos and paragraphs). However, there are also contents about actions and objects that exist in the segment. Hence, we propose a Local-Global Graph Pooling Network (LGGP) via Mutual Information Maximization for video-paragraph retrieval. Our model disentangles videos and paragraphs into four levels: overall-level, segment-level, motion-level, and object-level. We construct the Hierarchical Local Graph (segment-level, motion-level, and object-level) and the Hierarchical Global Graph (overall-level, segment-level, motion-level, and object-level), respectively, for semantic interaction among different levels. Meanwhile, to obtain hierarchical pooling features with fine-grained semantic information, we design hierarchical graph pooling methods to maximize the mutual information between pooling features and corresponding graph nodes. We evaluate our model on two video-paragraph retrieval datasets with three different video features. The experimental results show that our model establishes state-of-the-art results for video-paragraph retrieval. Our code will be released at https://github.com/PengchengZhang1997/LGGP. [ABSTRACT FROM AUTHOR]

Subjects :: *VIDEOS
*FEATURE extraction

Details

Language :: English
ISSN :: 1051-8215
Volume :: 32
Issue :: 10
Database :: Academic Search Index
Journal :: IEEE Transactions on Circuits & Systems for Video Technology
Publication Type :: Academic Journal
Accession number :: 160693873
Full Text :: https://doi.org/10.1109/TCSVT.2022.3176866
