Heterogeneous Graph Fusion Network for cross-modal image-text retrieval.
- Source :
- Expert Systems with Applications. Sep2024:Part C, Vol. 249, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2024
Abstract
- Exploring the semantic correspondence of image-text pairs is significant as it bridges vision and language. Most prior works focus on global semantic alignment or local semantic alignment, by developing a fine neural network that facilitates the corresponding alignment but neglects the semantic information and relative position information between image regions, or text words, which will lead to a non-meaningful alignment. To this end, a Heterogeneous Graph Fusion Network (HGFN) is proposed to explore the correlation score of vision-language for improving the accuracy of cross-modal image-text retrieval in this paper. Specifically, we first construct an undirected fully-connected graph based on the semantic or relative position information for each image, as well as a textual graph with neighborhood information of the text. Then, we present a graph fusion module to integrate the features of heterogeneous graphs into a unified hybrid representation, in which the graph convolutional network is utilized to gather neighborhood information to alleviate potentially non-meaningful alignment. In addition, we also propose a novel "Dynamic top-K negative" strategy for the selection of negative examples in the training process. Experimental results demonstrate that HGFN achieves comparable performance with state-of-the-art approaches on the Flickr30K and MSCOCO datasets.
• Cross-modal image-text retrieval problems can be treated as graph-graph matching.
• A graph fusion module is designed to fuse the visual graph and textual graph.
• A novel strategy for selecting negative examples is proposed.
• Competitive with state-of-the-art performance on two standard benchmarks.
[ABSTRACT FROM AUTHOR]
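The abstract's idea of an undirected fully-connected graph over image regions, whose neighborhood information is then gathered, can be illustrated with a minimal sketch. This is not the authors' HGFN implementation: the function names (`build_semantic_graph`, `aggregate`), the use of cosine similarity as the edge weight, the row-softmax normalization, and the toy 2-D region features are all assumptions made for illustration. The learned weight matrices and nonlinearity of a real graph convolutional layer are omitted.

```python
import math

def cosine(u, v):
    # Cosine similarity between two feature vectors (0.0 if either is all zeros).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_semantic_graph(features):
    # Assumed edge weight: cosine similarity between every pair of nodes,
    # giving a symmetric (undirected), fully-connected adjacency matrix.
    n = len(features)
    return [[cosine(features[i], features[j]) for j in range(n)] for i in range(n)]

def row_softmax(adj):
    # Normalize each row so that a node's incoming weights sum to 1.
    out = []
    for row in adj:
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def aggregate(features, adj):
    # One neighborhood-aggregation step: each node becomes a weighted
    # average of all node features (no learned parameters in this sketch).
    weights = row_softmax(adj)
    dim = len(features[0])
    n = len(features)
    return [[sum(w[j] * features[j][d] for j in range(n)) for d in range(dim)]
            for w in weights]

# Hypothetical region features: two similar regions and one dissimilar region.
regions = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
adj = build_semantic_graph(regions)
updated = aggregate(regions, adj)
```

After aggregation, each region's feature is pulled toward the regions it is most similar to, which is the sense in which neighborhood information is gathered before any cross-modal matching.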
- Subjects :
- *PERFORMANCE standards
- *VISUAL cryptography
Details
- Language :
- English
- ISSN :
- 09574174
- Volume :
- 249
- Database :
- Academic Search Index
- Journal :
- Expert Systems with Applications
- Publication Type :
- Academic Journal
- Accession number :
- 176785359
- Full Text :
- https://doi.org/10.1016/j.eswa.2024.123842