Multi-Path and Group-Loss-Based Network for Speech Emotion Recognition in Multi-Domain Datasets.

Authors :: Noh KJ; Jeong CY; Lim J; Chung S; Kim G; Lim JM; Jeong H
Source :: Sensors (Basel, Switzerland) [Sensors (Basel)] 2021 Feb 24; Vol. 21 (5). Date of Electronic Publication: 2021 Feb 24.
Publication Year :: 2021
Abstract: Speech emotion recognition (SER) is a natural method of recognizing individual emotions in everyday life. To distribute SER models to real-world applications, some key challenges must be overcome, such as the lack of datasets tagged with emotion labels and the weak generalization of the SER model for an unseen target domain. This study proposes a multi-path and group-loss-based network (MPGLN) for SER to support multi-domain adaptation. The proposed model includes a bidirectional long short-term memory-based temporal feature generator and a transferred feature extractor from the pre-trained VGG-like audio classification model (VGGish), and it learns simultaneously based on multiple losses according to the association of emotion labels in the discrete and dimensional models. For the evaluation of the MPGLN SER as applied to multi-cultural domain datasets, the Korean Emotional Speech Database (KESD), including KESDy18 and KESDy19, is constructed, and the English-speaking Interactive Emotional Dyadic Motion Capture database (IEMOCAP) is used. The evaluation of multi-domain adaptation and domain generalization showed 3.7% and 3.5% improvements, respectively, of the F1 score when comparing the performance of MPGLN SER with a baseline SER model that uses a temporal feature generator. We show that the MPGLN SER efficiently supports multi-domain adaptation and reinforces model generalization.

Subjects :: Humans; Databases, Factual; Emotions classification; Machine Learning; Pattern Recognition, Automated; Speech

Details

Language :: English
ISSN :: 1424-8220
Volume :: 21
Issue :: 5
Database :: MEDLINE
Journal :: Sensors (Basel, Switzerland)
Publication Type :: Academic Journal
Accession number :: 33668254
Full Text :: https://doi.org/10.3390/s21051579
