Metric Learning-Based Multimodal Audio-Visual Emotion Recognition.
- Source :
- IEEE MultiMedia; Jan/Mar 2020, Vol. 27 Issue 1, p37-48, 12p
- Publication Year :
- 2020
Abstract
- People express their emotions through multiple channels, such as visual and audio ones. Consequently, automatic emotion recognition can benefit significantly from multimodal learning. Although each modality exhibits unique characteristics, multimodal learning takes advantage of the complementary information that diverse modalities provide when measuring the same instance, resulting in an enhanced understanding of emotions. Yet these dependencies and relations are not fully exploited in audio–video emotion recognition. Furthermore, learning an effective metric across modalities is a crucial goal for many machine-learning applications. Therefore, in this article, we propose multimodal emotion recognition metric learning (MERML), learned jointly to obtain a discriminative score and a robust latent-space representation for both modalities. The learned metric is then used efficiently through a radial basis function (RBF) based support vector machine (SVM) kernel. The evaluation of our framework shows significant performance gains, improving on the state-of-the-art results on the eNTERFACE and CREMA-D datasets. [ABSTRACT FROM AUTHOR]
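- The abstract's core idea, plugging a learned metric into an RBF SVM kernel, can be illustrated in isolation. Below is a minimal sketch, assuming a generic Mahalanobis-style metric matrix `M`; the actual MERML training objective is not reproduced, and all names, data, and parameters here are illustrative, not the authors' implementation.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical learned metric matrix M. MERML learns it jointly across
# modalities; here M is a random positive semi-definite stand-in.
rng = np.random.default_rng(0)
d = 16
A = rng.normal(size=(d, d))
M = A @ A.T  # symmetric PSD, so it induces a valid Mahalanobis distance

def metric_rbf_kernel(X, Y, M=M, gamma=0.1):
    """RBF kernel computed under the Mahalanobis distance induced by M."""
    XM = X @ M
    # Squared Mahalanobis distances between every pair (x_i, y_j):
    # (x - y)^T M (x - y) = x^T M x + y^T M y - 2 x^T M y
    d2 = (np.einsum('ij,ij->i', XM, X)[:, None]
          + np.einsum('ij,ij->i', Y @ M, Y)[None, :]
          - 2 * XM @ Y.T)
    return np.exp(-gamma * np.clip(d2, 0, None))

# Toy fused audio-visual feature vectors and emotion labels (stand-ins).
X_train = rng.normal(size=(40, d))
y_train = rng.integers(0, 2, size=40)

clf = SVC(kernel=metric_rbf_kernel)  # scikit-learn accepts a callable kernel
clf.fit(X_train, y_train)
print(clf.predict(X_train[:5]))
```

- Passing a callable kernel to `SVC` is what lets the Mahalanobis-based RBF replace the default Euclidean one, so the SVM's decision boundary respects whatever geometry the learned metric encodes.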
Details
- Language :
- English
- ISSN :
- 1070-986X
- Volume :
- 27
- Issue :
- 1
- Database :
- Complementary Index
- Journal :
- IEEE MultiMedia
- Publication Type :
- Academic Journal
- Accession number :
- 142470898
- Full Text :
- https://doi.org/10.1109/MMUL.2019.2960219