Heterogeneous Graph Fusion Network for cross-modal image-text retrieval.

Authors :
Qin, Xueyang
Li, Lishuang
Pang, Guangyao
Hao, Fei
Source :
Expert Systems with Applications. Sep2024:Part C, Vol. 249, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

Exploring the semantic correspondence of image-text pairs is significant as it bridges vision and language. Most prior works focus on global semantic alignment or local semantic alignment, by developing a fine neural network that facilitates the corresponding alignment but neglects the semantic information and relative position information between image regions, or text words, which will lead to a non-meaningful alignment. To this end, a Heterogeneous Graph Fusion Network (HGFN) is proposed to explore the correlation score of vision-language for improving the accuracy of cross-modal image-text retrieval in this paper. Specifically, we first construct an undirected fully-connected graph based on the semantic or relative position information for each image, as well as a textual graph with neighborhood information of the text. Then, we present a graph fusion module to integrate the features of heterogeneous graphs into a unified hybrid representation, in which the graph convolutional network is utilized to gather neighborhood information to alleviate potentially non-meaningful alignment. In addition, we also propose a novel "Dynamic top-K negative" strategy for the selection of negative examples in the training process. Experimental results demonstrate that HGFN achieves comparable performance with state-of-the-art approaches on the Flickr30K and MSCOCO datasets.

• Cross-modal image-text retrieval problems can be treated as graph-graph matching.
• A graph fusion module is designed to fuse the visual graph and textual graph.
• A novel strategy for selecting negative examples is proposed.
• Competitive with state-of-the-art performance on two standard benchmarks.

[ABSTRACT FROM AUTHOR]
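
The abstract names three ingredients without giving their details: a fully connected graph over image regions, neighborhood aggregation by graph convolution, and a top-K selection of negative examples. The following is a minimal illustrative sketch of those general ideas, not the authors' implementation. All function names, the exp(cosine) edge weighting, the parameter-free propagation step (a real graph convolutional network would also apply learned weights), the margin value, and the fixed K are assumptions made here for illustration; the abstract does not say how K is varied "dynamically".

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def fully_connected_adjacency(nodes):
    """Row-normalised edge weights of a fully connected graph with self-loops.

    Assumption: edges are weighted by exp(cosine similarity) of node features.
    """
    n = len(nodes)
    w = [[math.exp(cosine(nodes[i], nodes[j])) for j in range(n)] for i in range(n)]
    return [[x / sum(row) for x in row] for row in w]


def propagate(adj, nodes):
    """One parameter-free aggregation step: each node becomes the
    adjacency-weighted mean of all node features."""
    n, dim = len(nodes), len(nodes[0])
    return [
        [sum(adj[i][j] * nodes[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]


def top_k_negative_loss(sim, k, margin=0.2):
    """Hinge loss over the k hardest negatives in each direction.

    sim[i][j] is the similarity of image i and caption j; sim[i][i] is the
    matched pair. K is fixed here (hypothetical simplification).
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Hardest non-matching captions for image i, and images for caption i.
        row_neg = sorted((sim[i][j] for j in range(n) if j != i), reverse=True)[:k]
        col_neg = sorted((sim[j][i] for j in range(n) if j != i), reverse=True)[:k]
        total += sum(max(0.0, margin + s - pos) for s in row_neg + col_neg)
    return total / n
```

With k=1 this loss reduces to the familiar hardest-negative hinge loss; larger k averages the penalty over more near-miss pairs.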

Details

Language :
English
ISSN :
09574174
Volume :
249
Database :
Academic Search Index
Journal :
Expert Systems with Applications
Publication Type :
Academic Journal
Accession number :
176785359
Full Text :
https://doi.org/10.1016/j.eswa.2024.123842