
Unsupervised multi-perspective fusing semantic alignment for cross-modal hashing retrieval.

Authors :
Chen, Yongfeng
Tan, Junpeng
Yang, Zhijing
Shi, Yukai
Qin, Jinghui
Source :
Multimedia Tools & Applications; Jul2024, Vol. 83 Issue 23, p63993-64014, 22p
Publication Year :
2024

Abstract

Unsupervised deep cross-modal hashing methods have received extensive attention due to their low computational cost, compact storage, and efficient retrieval performance. However, existing unsupervised methods still face some challenges: (1) Without label semantics, the neighborhood structure information of unimodal and inter-modal instances may not be fully integrated, so deep semantic similarity interaction information is ignored. (2) Unsupervised hash codes can neither effectively preserve the semantic consistency of the original features of modal instances nor bridge the gap between the heterogeneous modalities of the hash codes. To address these issues, we propose a new unsupervised deep cross-modal hashing method called Multi-Perspective Fusing Semantic Alignment Hashing (MPFSAH). It comprises two main components. First, to enhance inter-modal communication, a Multi-level Semantic Similarity Interactive Measure (MSSIM) is constructed. By fusing the neighborhood structures of different modalities and increasing the distance between instances within a modality, MSSIM deeply mines semantic interaction similarity to obtain discriminative semantic information. Second, we propose a novel Multi-Perspective Semantic Alignment Mechanism (MPSAM) that learns inter-modal similarity consistency by minimizing the consistency quantization error of the elements of the multi-perspective similarity. MPSAM comprises similarity consistency alignment, structural-semantic alignment, and ranking alignment. It achieves structural-semantic consistency, ensures an effective connection between cross-modal data similarities, and bridges the modal gap during hash-code learning. Experiments on three cross-modal retrieval datasets demonstrate the effectiveness of the proposed method, which outperforms several state-of-the-art methods. [ABSTRACT FROM AUTHOR]
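To make the fusion-and-alignment idea concrete, below is a minimal sketch, assuming PyTorch and batch features from pretrained image and text encoders that project to a common feature dimension, of how a fused similarity target and a hash-code alignment loss might be built. It is not the authors' implementation; the fusion weight alpha, the tanh relaxation, and all function names are illustrative assumptions.

import torch
import torch.nn.functional as F

def cosine_sim(x):
    # Pairwise cosine similarity within one modality's batch features.
    x = F.normalize(x, dim=1)
    return x @ x.t()

def fused_similarity(img_feat, txt_feat, alpha=0.5):
    # Fuse unimodal neighborhood structures with a cross-modal term into
    # one joint target matrix (a stand-in for multi-level similarity fusion).
    s_img = cosine_sim(img_feat)
    s_txt = cosine_sim(txt_feat)
    s_cross = F.normalize(img_feat, dim=1) @ F.normalize(txt_feat, dim=1).t()
    return alpha * (s_img + s_txt) / 2 + (1 - alpha) * s_cross

def alignment_loss(img_code, txt_code, s_target):
    # Penalize inconsistency between hash-code similarity and the fused
    # target (a consistency-quantization-style error between modalities).
    b_img = torch.tanh(img_code)  # relaxed (continuous) binary codes
    b_txt = torch.tanh(txt_code)
    s_hash = b_img @ b_txt.t() / b_img.size(1)  # scale inner product to [-1, 1]
    return F.mse_loss(s_hash, s_target)

In practice a loss of this kind would be combined with intra-modal and ranking terms, mirroring the three alignment components the abstract names.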

Subjects

Subjects :
MODAL logic
SEMANTICS
NEIGHBORHOODS

Details

Language :
English
ISSN :
13807501
Volume :
83
Issue :
23
Database :
Complementary Index
Journal :
Multimedia Tools & Applications
Publication Type :
Academic Journal
Accession number :
178293362
Full Text :
https://doi.org/10.1007/s11042-023-18048-0