Multimodal sentiment analysis based on improved correlation representation network
- Source :
- International Journal of Communication Networks and Distributed Systems; 2024, Vol. 30, Issue 6, pp. 679–698 (20 pages)
- Publication Year :
- 2024
Abstract
- Multimodal sentiment analysis (MSA) extracts emotional information from language, acoustic, and visual sequences. Owing to the gap between modality-specific features, previous work has not fully exploited multimodal correlations, limiting the gains achievable from fusion strategies. To this end, we propose a multimodal correlation representation network (MCRN) that extracts multimodal features with a dual-output transformer. The first transformer output is fed to deep canonical correlation analysis (DCCA) to model the correlation between multimodal data; the second output is fused across modalities with an attention mechanism, and the model's final output is the emotional intensity. In addition, at the second output stage we design a single-modality output loss to balance the differences between subtasks. Extensive experiments show that our model achieves state-of-the-art performance, outperforming most existing methods on the multimodal opinion-level sentiment intensity (MOSI) and multimodal opinion sentiment and emotion intensity (MOSEI) datasets.
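- The DCCA objective mentioned in the abstract maximises the total canonical correlation between two views of the data (e.g., the transformer's representations of two modalities). A minimal numpy sketch of that correlation measure is shown below; the function name `cca_correlation` and the ridge term `reg` are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def cca_correlation(H1, H2, reg=1e-4):
    """Total canonical correlation between two views H1, H2 (n_samples x dim).

    Hypothetical sketch of a DCCA-style objective: whiten each view's
    covariance, then sum the singular values of the cross-covariance.
    A small ridge term `reg` keeps the covariance matrices invertible.
    """
    n = H1.shape[0]
    # Mean-center each view.
    H1 = H1 - H1.mean(axis=0)
    H2 = H2 - H2.mean(axis=0)
    # Regularised within-view and cross-view covariances.
    S11 = H1.T @ H1 / (n - 1) + reg * np.eye(H1.shape[1])
    S22 = H2.T @ H2 / (n - 1) + reg * np.eye(H2.shape[1])
    S12 = H1.T @ H2 / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root via eigendecomposition (S is symmetric PD).
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    # Singular values of T are the canonical correlations; their sum is
    # the quantity a DCCA loss would maximise (equivalently, minimise its negative).
    T = inv_sqrt(S11) @ S12 @ inv_sqrt(S22)
    return np.linalg.svd(T, compute_uv=False).sum()
```

In a deep model this quantity would be computed on mini-batches of the two networks' outputs and its negative used as the correlation loss, alongside the task (sentiment intensity) loss.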
Details
- Language :
- English
- ISSN :
- 1754-3916 and 1754-3924
- Volume :
- 30
- Issue :
- 6
- Database :
- Supplemental Index
- Journal :
- International Journal of Communication Networks and Distributed Systems
- Publication Type :
- Periodical
- Accession number :
- ejs67558491
- Full Text :
- https://doi.org/10.1504/IJCNDS.2024.141670