Emotional Voice Messages (EMOVOME) database: emotion recognition in spontaneous voice messages

Authors :
Zaragozá, Lucía Gómez
del Amor, Rocío
Vargas, Elena Parra
Naranjo, Valery
Raya, Mariano Alcañiz
Marín-Morales, Javier
Publication Year :
2024

Abstract

Emotional Voice Messages (EMOVOME) is a spontaneous speech dataset containing 999 audio messages from real conversations on a messaging app, recorded by 100 Spanish speakers, balanced by gender. The voice messages were produced in in-the-wild conditions before participants were recruited, avoiding any conscious bias due to a laboratory environment. The audio messages were labeled along the valence and arousal dimensions by three non-experts and two experts, and these annotations were then combined to obtain a final label per dimension. The experts also provided an extra label corresponding to seven emotion categories. To set a baseline for future investigations using EMOVOME, we implemented emotion recognition models using both speech and audio transcriptions. For speech, we used the standard eGeMAPS feature set and support vector machines, obtaining 49.27% and 44.71% unweighted accuracy for valence and arousal, respectively. For text, we fine-tuned a multilingual BERT model and achieved 61.15% and 47.43% unweighted accuracy for valence and arousal, respectively. This database will significantly contribute to research on emotion recognition in the wild, while also providing a unique, natural, and freely accessible resource for Spanish.

Comment: This paper has been superseded by arXiv:2403.02167 (merged from the description of the EMOVOME database in arXiv:2402.17496v1 and the speech emotion recognition models in arXiv:2403.02167v1)

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2402.17496
Document Type :
Working Paper