201. Self-attention and adversary learning deep hashing network for cross-modal retrieval.
- Author
- Chen, Shubai, Wu, Song, Wang, Li, and Yu, Zhenyang
- Subjects
- *DEEP learning, *HUMAN-computer interaction, *SOURCE code, *INFORMATION retrieval, *LOCAL government
- Abstract
Multi-modal information retrieval is among the prevailing forms of daily human–computer interaction. Recent deep cross-modal hashing methods have received increasing attention because of their superior search performance and efficiency. However, effectively exploring high-level semantic correlations and preserving representation consistency remain challenging due to the heterogeneity of different modalities. In this paper, a Self-Attention and Adversary Learning Hashing Network (SAALDH) is designed for large-scale cross-modal retrieval. Specifically, the hash representations across different layers of the deep network are integrated, and the significance of each position in the integrated hash representation is enhanced by a novel self-attention mechanism. Meanwhile, an adversarial learning mechanism is adopted to further preserve the consistency of hash representations during hash function learning. Moreover, a novel batch semi-hard selection strategy is designed for the triplet loss to avoid local optima during the optimization of SAALDH. Experimental results on two large-scale image-text datasets show the effectiveness and efficiency of the proposed SAALDH, which outperforms several state-of-the-art methods. The source code of SAALDH is available at: http://github.com/SWU-CS-MediaLab/SAALDH. [ABSTRACT FROM AUTHOR]
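The abstract mentions a batch semi-hard selection for the triplet loss but this record does not include the authors' formulation. As a general illustration of the batch semi-hard strategy the abstract refers to (not the authors' code; the function name, NumPy setup, and margin value are assumptions), a sketch over one mini-batch of embeddings:

```python
import numpy as np

def semi_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Mean triplet loss over semi-hard triplets in a batch.

    A negative n is "semi-hard" for anchor a and positive p when it is
    farther from a than p is, but still inside the margin:
        d(a, p) < d(a, n) < d(a, p) + margin.
    """
    # Pairwise squared Euclidean distances, shape (batch, batch).
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    d = np.sum(diff ** 2, axis=2)

    losses = []
    n = len(labels)
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue  # p must be a distinct same-label positive
            # Semi-hard negatives: farther than the positive, within margin.
            neg_mask = (labels != labels[a]) & (d[a] > d[a, p]) & (d[a] < d[a, p] + margin)
            if neg_mask.any():
                # Pick the hardest (closest) semi-hard negative.
                n_idx = np.where(neg_mask)[0][np.argmin(d[a, neg_mask])]
                losses.append(max(0.0, d[a, p] - d[a, n_idx] + margin))
    return float(np.mean(losses)) if losses else 0.0
```

Selecting negatives inside the margin band keeps gradients informative: trivially easy negatives contribute zero loss, while the very hardest ones can destabilize early training, which is the usual motivation for semi-hard mining.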
- Published
- 2021