1. Measuring Similarity of Dual-Modal Academic Data Based on Multi-Fusion Representation Learning
- Author
Li Zhang, Qiang Gao, Ming Liu, Zepeng Gu, and Bo Lang
- Subjects
Scholarly big data, deep learning, multi-fusion, dual-modal academic data, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Nowadays, academic materials such as articles, patents, lecture notes, and observation records often use both text and images (i.e., dual-modal data) to illustrate scientific issues. Measuring the similarity of such dual-modal academic data depends largely on the quality of dual-modal features, which in practice remains far from satisfactory. To learn dual-modal feature representations, most current approaches mine interactions between texts and images on top of their fusion networks. This work proposes a multi-fusion deep learning framework that learns semantically richer dual-modal representations. The framework places multiple fusion points in the feature spaces of various levels and progressively integrates the fusion information from the low level to the high level. In addition, we develop a multi-channel decoding network with alternate fine-tuning strategies to thoroughly mine modality-specific features and cross-modal correlations. To our knowledge, this is the first work to apply deep learning to dual-modal academic data in this way. It reduces the semantic and statistical attribute differences between the two modalities, thereby learning robust representations. Extensive experiments on real-world datasets show that our method performs significantly better than state-of-the-art approaches.
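To make the layered design concrete, below is a minimal PyTorch-style sketch, not the authors' implementation: every module name, dimension, and the way lower-level fusion results are carried into higher levels are assumptions based only on the abstract's description of multiple fusion points integrated from low to high level; the multi-channel decoder and alternate fine-tuning are omitted.

```python
# Hypothetical sketch of multi-level dual-modal fusion (assumed design,
# not the paper's code): one fusion point per feature level, with the
# fused result from the lower level carried upward into the next level.
import torch
import torch.nn as nn


class FusionBlock(nn.Module):
    """Fuses text and image features at one level of the feature hierarchy."""

    def __init__(self, text_dim, image_dim, fused_dim):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, fused_dim)
        self.image_proj = nn.Linear(image_dim, fused_dim)
        self.merge = nn.Sequential(nn.Linear(2 * fused_dim, fused_dim), nn.ReLU())

    def forward(self, text_feat, image_feat, prev_fused=None):
        fused = self.merge(torch.cat(
            [self.text_proj(text_feat), self.image_proj(image_feat)], dim=-1))
        if prev_fused is not None:
            fused = fused + prev_fused  # carry lower-level fusion upward
        return fused


class MultiFusionEncoder(nn.Module):
    """Progressively integrates fusion information from low to high level."""

    def __init__(self, level_dims=(256, 512, 1024), fused_dim=512):
        super().__init__()
        # One fusion point per feature level (low -> mid -> high); the
        # per-level dimensions here are arbitrary placeholders.
        self.fusions = nn.ModuleList(
            FusionBlock(d, d, fused_dim) for d in level_dims)

    def forward(self, text_levels, image_levels):
        fused = None
        for block, t, v in zip(self.fusions, text_levels, image_levels):
            fused = block(t, v, fused)
        return fused  # final dual-modal document representation


# Example use: similarity between two dual-modal documents, with random
# stand-in features for each level of the text and image encoders.
enc = MultiFusionEncoder()

def embed():
    text_levels = [torch.randn(1, d) for d in (256, 512, 1024)]
    image_levels = [torch.randn(1, d) for d in (256, 512, 1024)]
    return enc(text_levels, image_levels)

doc_a, doc_b = embed(), embed()
similarity = torch.cosine_similarity(doc_a, doc_b, dim=-1)
```

Under this reading, similarity between two documents is simply the cosine similarity of their final fused representations; the residual-style carry from level to level is one plausible way to realize the "gradual integration" the abstract describes.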
- Published
2024