
Speech Emotion Recognition on Small Sample Learning by Hybrid WGAN-LSTM Networks.

Authors :
Sun, Cunwei
Ji, Luping
Zhong, Hailing
Source :
Journal of Circuits, Systems & Computers. 2022, Vol. 31 Issue 4, p1-15. 15p.
Publication Year :
2022

Abstract

Speech emotion recognition with deep networks on small samples is often a challenging problem in natural language processing. The massive number of parameters in a deep network is difficult to train reliably on a small quantity of speech samples. To address this problem, we propose a new method that systematically combines a Generative Adversarial Network (GAN) and Long Short-Term Memory (LSTM). The method uses adversarial training of the GAN's generator and discriminator on speech spectrogram images to provide sufficient sample augmentation. A six-layer convolutional neural network (CNN), followed in series by a two-layer LSTM, is designed to extract features from speech spectrograms. To accelerate network training, the parameters of the discriminator are transferred to our feature extractor. With this sample augmentation, a well-trained feature extraction network and an efficient classifier can be obtained. Tests and comparisons on two publicly available datasets, EMO-DB and IEMOCAP, show that our new method is effective and often superior to some state-of-the-art methods. [ABSTRACT FROM AUTHOR]
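
The abstract does not state the adversarial training objective, but the title names a Wasserstein GAN (WGAN). As a rough illustration only, the commonly used WGAN losses can be sketched in plain Python as follows. The function names `critic_loss` and `generator_loss` and the example scores are hypothetical and are not taken from the paper, which may use a different formulation.

```python
from statistics import fmean


def critic_loss(real_scores, fake_scores):
    """Common WGAN critic loss: mean score on generated samples
    minus mean score on real samples. Minimizing it pushes real
    scores up and generated scores down."""
    return fmean(fake_scores) - fmean(real_scores)


def generator_loss(fake_scores):
    """Common WGAN generator loss: negative mean critic score
    on generated samples."""
    return -fmean(fake_scores)


# Hypothetical critic outputs for a batch of real and generated
# spectrograms (illustrative values only, not results from the paper).
real = [0.9, 1.1, 1.0]
fake = [-0.5, 0.0, 0.5]

print("critic loss:", critic_loss(real, fake))
print("generator loss:", generator_loss(fake))
```

In this formulation the critic (the discriminator, in the abstract's terms) outputs an unbounded score rather than a probability, so no sigmoid or logarithm appears in either loss.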

Details

Language :
English
ISSN :
02181266
Volume :
31
Issue :
4
Database :
Academic Search Index
Journal :
Journal of Circuits, Systems & Computers
Publication Type :
Academic Journal
Accession number :
155781936
Full Text :
https://doi.org/10.1142/S0218126622500736