
A variance modeling framework based on variational autoencoders for speech enhancement

Authors :
Leglaive, Simon
Girin, Laurent
Horaud, Radu
Source :
Proc. of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark, September 2018
Publication Year :
2019

Abstract

In this paper we address the problem of enhancing speech signals in noisy mixtures using a source separation approach. We explore the use of neural networks as an alternative to a popular speech variance model based on supervised non-negative matrix factorization (NMF). More precisely, we use a variational autoencoder as a speaker-independent supervised generative speech model, highlighting the conceptual similarities that this approach shares with its NMF-based counterpart. In order to be free of generalization issues regarding the noisy recording environments, we follow the approach of having a supervised model only for the target speech signal, the noise model being based on unsupervised NMF. We develop a Monte Carlo expectation-maximization algorithm for inferring the latent variables in the variational autoencoder and estimating the unsupervised model parameters. Experiments show that the proposed method outperforms a semi-supervised NMF baseline and a state-of-the-art fully supervised deep learning approach.

Comment: 6 pages, 3 figures
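The abstract does not include implementation details. The following PyTorch sketch illustrates one plausible form of a VAE used as a speech variance model: an encoder maps a power spectrogram frame to a Gaussian posterior over a latent vector, and a decoder maps the latent vector to a per-frequency variance of a zero-mean complex Gaussian on the speech STFT coefficients. The class name, layer sizes, latent dimension, and loss function below are illustrative assumptions, not the paper's actual settings.

```python
import torch
import torch.nn as nn

class SpeechVarianceVAE(nn.Module):
    """Hypothetical sketch of a VAE speech variance model (not the paper's architecture)."""
    def __init__(self, n_freq=513, latent_dim=64, hidden=128):
        super().__init__()
        # Encoder: power spectrogram frame -> Gaussian posterior over latent z
        self.enc = nn.Sequential(nn.Linear(n_freq, hidden), nn.Tanh())
        self.enc_mean = nn.Linear(hidden, latent_dim)
        self.enc_logvar = nn.Linear(hidden, latent_dim)
        # Decoder: latent z -> per-frequency log-variance of the speech STFT coefficients
        self.dec = nn.Sequential(nn.Linear(latent_dim, hidden), nn.Tanh())
        self.dec_logvar = nn.Linear(hidden, n_freq)

    def encode(self, power_spec):
        h = self.enc(power_spec)
        return self.enc_mean(h), self.enc_logvar(h)

    def decode(self, z):
        # The decoded value parameterizes the variance of a zero-mean,
        # circularly symmetric complex Gaussian for each frequency bin.
        return torch.exp(self.dec_logvar(self.dec(z)))

    def forward(self, power_spec):
        mean, logvar = self.encode(power_spec)
        # Reparameterization trick: sample z from the approximate posterior.
        z = mean + torch.exp(0.5 * logvar) * torch.randn_like(mean)
        return self.decode(z), mean, logvar


def vae_loss(power_spec, var_s, mean, logvar):
    # Reconstruction term: negative log-likelihood of the complex Gaussian
    # observation model (up to constants), which has an Itakura-Saito flavor.
    recon = (power_spec / var_s + torch.log(var_s)).sum(dim=-1)
    # KL divergence between the approximate posterior and a standard normal prior.
    kl = -0.5 * (1.0 + logvar - mean.pow(2) - logvar.exp()).sum(dim=-1)
    return (recon + kl).mean()
```

At enhancement time, the paper describes combining such a pretrained speech model with an unsupervised NMF noise model via a Monte Carlo expectation-maximization algorithm; that inference procedure is not shown here.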

Details

Database :
arXiv
Journal :
Proc. of the IEEE International Workshop on Machine Learning for Signal Processing (MLSP), Aalborg, Denmark, September 2018
Publication Type :
Report
Accession number :
edsarx.1902.01605
Document Type :
Working Paper
Full Text :
https://doi.org/10.1109/MLSP.2018.8516711