Image-Text Joint Learning for Social Images with Spatial Relation Model

Authors :
Feng Jiangfan
Xiaobo Luo
Yuling Zhu
Fu Xuejun
Yao Zhou
Source :
Complexity, Vol 2020 (2020)
Publication Year :
2020
Publisher :
Hindawi Limited, 2020.

Abstract

Rapid developments in sensor technology and mobile devices have produced an abundance of social images, and large-scale social images have attracted increasing attention from researchers. Existing approaches generally rely on recognizing object instances individually using geo-tags, visual patterns, etc. However, a social image represents a web of interconnected relations; these relations between entities carry semantic meaning and help a viewer differentiate between instances of a substance. This article explores the joint learning of social images from the perspective of spatial relationships. Specifically, the model consists of three parts: (a) a module for deep semantic understanding of images based on a residual network (ResNet); (b) a deep semantic analysis module for text that goes beyond traditional bag-of-words methods; (c) a joint reasoning module in which the text weights are obtained from image features via self-attention and a novel tree-based clustering algorithm. Experimental results on the Flickr30k and Microsoft COCO datasets demonstrate the effectiveness of the approach. In addition, our method takes spatial relations into account during matching.

Details

ISSN :
1099-0526 and 1076-2787
Volume :
2020
Database :
OpenAIRE
Journal :
Complexity
Accession number :
edsair.doi.dedup.....368da8820bba4017e2146ddca0d84112