Cross-Modal Entity Resolution Based on Co-Attentional Generative Adversarial Network

Authors :
Jianjun Cao
Chen Chang
Guojun Lv
Qibin Zheng
Source :
Proceedings of the 2019 4th International Conference on Multimedia Systems and Signal Processing.
Publication Year :
2019
Publisher :
ACM, 2019.

Abstract

Cross-modal entity resolution aims to find semantically similar items among objects of different modalities (e.g., image and text). The core approach to the problem is to construct a shared space in which multi-modal examples can be represented uniformly. In this paper, we propose a novel Co-Attentional Generative Adversarial Network (CAGAN) method for cross-modal entity resolution, which seeks an effective shared space based on a co-attention mechanism and adversarial learning. The generative adversarial network that we design contains two parts, a generator and a discriminator. The generator aims to generate a shared space through an intra-modal loss and an inter-modal loss, while the discriminator is a classifier that tries to identify the modality of a generated representation. To eliminate the imbalance of information between modalities, generate more consistent representations, and accelerate the convergence of the network, a co-attention mechanism is introduced into the network. Experimental results on two cross-modal datasets demonstrate the outstanding performance of the proposed method for cross-modal entity resolution.
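
The components named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no architectural details, so the max-pooled co-attention, the squared-distance inter-modal loss, the binary cross-entropy modality classifier, and every function name below are assumptions chosen only to show how the pieces could fit together.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    # Convex combination of equal-length vectors.
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]

def co_attention(img_regions, txt_words):
    # Assumed form: affinity between every image region and every word.
    affinity = [[dot(r, w) for w in txt_words] for r in img_regions]
    # Weight each region by its best-matching word, and each word likewise.
    img_weights = softmax([max(row) for row in affinity])
    txt_weights = softmax([max(col) for col in zip(*affinity)])
    return (weighted_sum(img_weights, img_regions),
            weighted_sum(txt_weights, txt_words))

def inter_modal_loss(img_vec, txt_vec):
    # Assumed form: squared distance between a matched image/text pair.
    return sum((a - b) ** 2 for a, b in zip(img_vec, txt_vec))

def discriminator_loss(probs_image, labels):
    # Binary cross-entropy of a modality classifier (label 1 = image, 0 = text).
    eps = 1e-7
    total = 0.0
    for p, y in zip(probs_image, labels):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(labels)

# Toy inputs (hypothetical): two image regions and three words in a 2-d space.
img_regions = [[1.0, 0.0], [0.0, 1.0]]
txt_words = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
img_vec, txt_vec = co_attention(img_regions, txt_words)
gap = inter_modal_loss(img_vec, txt_vec)
d_loss = discriminator_loss([0.5, 0.5], [1, 0])
```

In an adversarial setup of this kind, the discriminator would be trained to lower `d_loss`, while the generator would be trained to lower `gap` and to raise `d_loss`, so that representations from the two modalities become hard to tell apart.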

Details

Database :
OpenAIRE
Journal :
Proceedings of the 2019 4th International Conference on Multimedia Systems and Signal Processing
Accession number :
edsair.doi...........e096eee14b4bfabcbb81d0c1d71aa9bb
Full Text :
https://doi.org/10.1145/3330393.3330417