Learning Video Moment Retrieval Without a Single Annotated Video
- Source :
- IEEE Transactions on Circuits and Systems for Video Technology. 32:1646-1657
- Publication Year :
- 2022
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2022.
Abstract
- Video moment retrieval, which aims to find the moment in a video most relevant to a given natural language query, has progressed significantly over the past few years. Most existing methods are trained in a fully supervised or weakly supervised manner, which requires a time-consuming and expensive manual labeling process. In this work, we propose an alternative approach to video moment retrieval that requires no textual annotations of videos and instead leverages existing visual concept detectors and a pre-trained image-sentence embedding space. Specifically, we design a video-conditioned sentence generator that produces a suitable sentence representation from the visual concepts mined in videos. We then design a GNN-based relation-aware moment localizer that selects a portion of the video clips under the guidance of the generated sentence. Finally, the pre-trained image-sentence embedding space is used to evaluate the matching scores between the generated sentence and the moment representations, with knowledge transferred from the image domain. By maximizing these scores, the sentence generator and the moment localizer enhance and complement each other to accomplish the moment retrieval task. Experimental results on the Charades-STA and ActivityNet Captions datasets demonstrate the effectiveness of the proposed method.
- Subjects :
- Matching (graph theory)
Natural language user interface
Computer science
business.industry
computer.software_genre
Task (project management)
Moment (mathematics)
Media Technology
Embedding
Artificial intelligence
Electrical and Electronic Engineering
Representation (mathematics)
business
computer
Sentence
Natural language processing
Generator (mathematics)
Details
- ISSN :
- 1558-2205 and 1051-8215
- Volume :
- 32
- Database :
- OpenAIRE
- Journal :
- IEEE Transactions on Circuits and Systems for Video Technology
- Accession number :
- edsair.doi...........fbf4ecc1557349bbc53e7bb0a829ac8e