Event-Oriented State Alignment Network for Weakly Supervised Temporal Language Grounding.

Authors :: Wu, Hongzhou
Zhang, Xiang
Tang, Tao
Yang, Canqun
Luo, Zhigang
Source :: Entropy. Sep2024, Vol. 26 Issue 9, p730. 17p.
Publication Year :: 2024
Abstract: Weakly supervised temporal language grounding (TLG) aims to locate events in untrimmed videos based on natural language queries without temporal annotations, necessitating a deep understanding of semantic context across both video and text modalities. Existing methods often focus on simple correlations between query phrases and isolated video segments, neglecting the event-oriented semantic coherence and consistency required for accurate temporal grounding. This can lead to misleading results due to partial frame correlations. To address these limitations, we propose the Event-oriented State Alignment Network (ESAN), which constructs "start–event–end" semantic state sets for both textual and video data. ESAN employs relative entropy for cross-modal alignment through knowledge distillation from pre-trained large models, thereby enhancing semantic coherence within each modality and ensuring consistency across modalities. Our approach leverages vision–language models to extract static frame semantics and large language models to capture dynamic semantic changes, facilitating a more comprehensive understanding of events. Experiments conducted on two benchmark datasets demonstrate that ESAN significantly outperforms existing methods. By reducing false high correlations and improving the overall performance, our method effectively addresses the challenges posed by previous approaches. These advancements highlight the potential of ESAN to improve the precision and reliability of temporal language grounding tasks. [ABSTRACT FROM AUTHOR]
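Illustrative note: the abstract says ESAN uses relative entropy (Kullback-Leibler divergence) for cross-modal alignment via knowledge distillation, but this record gives no formulas. The sketch below only shows what a relative-entropy distillation term over a three-state "start, event, end" distribution could look like. The function names, the softmax normalization, and the example scores are all assumptions made for illustration, not the authors' implementation.

```python
import math

def softmax(scores):
    """Turn raw similarity scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def relative_entropy(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i), with eps guarding against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

# Hypothetical scores over the three semantic states (start, event, end).
# "teacher" stands in for a pre-trained model, "student" for the grounding network.
teacher_scores = [2.0, 0.5, 0.1]
student_scores = [1.5, 0.7, 0.2]

teacher = softmax(teacher_scores)
student = softmax(student_scores)

# Distillation-style loss: how far the student distribution is from the teacher's.
loss = relative_entropy(teacher, student)
print(f"relative entropy (teacher || student): {loss:.4f}")
```

The divergence is zero only when the two distributions match, so minimizing it pulls the student's state distribution toward the teacher's. Which direction of the divergence ESAN uses, and over which states it is computed, is not stated in this record.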

Subjects :: *LANGUAGE models
*FRAMES (Linguistics)
*NATURAL languages
*ENTROPY
*DYNAMIC models
*SEMANTICS

Details

Language :: English
ISSN :: 10994300
Volume :: 26
Issue :: 9
Database :: Academic Search Index
Journal :: Entropy
Publication Type :: Academic Journal
Accession number :: 179965441
Full Text :: https://doi.org/10.3390/e26090730
