Research on Speech Enhancement Algorithm by Fusing Improved EMD and GCRN Networks.

Authors :
Lan, Chaofeng
Chen, Huan
Zhang, Lei
Zhao, Shilong
Guo, Rui
Fan, Zixu
Source :
Circuits, Systems & Signal Processing; Jul2024, Vol. 43 Issue 7, p4588-4604, 17p
Publication Year :
2024

Abstract

At low signal-to-noise ratios (SNRs), traditional neural networks extract speech features insufficiently and deliver limited speech enhancement. To address this problem, this paper combines empirical mode decomposition (EMD), a temporal convolutional network (TCN), and a gated convolutional recurrent neural network (GCRN) with a feature fusion module (FFM) to build a speech enhancement model: the adaptive mean median-empirical mode decomposition-multilayer gated feature fusion module convolutional recurrent neural networks (ME-MGFCRNs). The model uses a split-frequency learning strategy to learn low-frequency and high-frequency features separately: the TCN and MGFCRN networks obtain the low-frequency and high-frequency features, and the FFM processes the two sets of features to achieve speech enhancement in the form of feature mapping. Ablation and comparison experiments are performed on the dataset, and the enhancement effect is evaluated with the PESQ, FwSegSNR, and STOI metrics. The results show that the proposed model improves on the other baseline models under different noise environments and SNR conditions. In particular, at the low SNR of −5 dB, FwSegSNR and PESQ improve by more than 0.86 dB and 0.02, respectively, compared with the other baseline models. [ABSTRACT FROM AUTHOR]
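
The split-frequency idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: a plain FFT mask stands in for the improved EMD, simple summation stands in for the FFM, and the function name `split_bands`, the 16 kHz sample rate, and the 1 kHz cutoff are all assumptions chosen for illustration only.

```python
import numpy as np

def split_bands(signal, sample_rate, cutoff_hz):
    """Split a real signal into low- and high-frequency parts.

    Hypothetical helper: an FFT mask is used here as a stand-in for the
    paper's improved EMD, which is not reproduced in this sketch.
    """
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    low_mask = freqs <= cutoff_hz
    # Keep only the bins below (or above) the cutoff, then go back to time.
    low = np.fft.irfft(spectrum * low_mask, n=n)
    high = np.fft.irfft(spectrum * ~low_mask, n=n)
    return low, high

# Toy input: a 200 Hz tone plus a 3 kHz tone, 0.1 s at 16 kHz (assumed values).
sample_rate = 16000
t = np.arange(1600) / sample_rate
mixture = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

low, high = split_bands(mixture, sample_rate, cutoff_hz=1000.0)

# In the paper, each band would be processed by its own network before fusion;
# here the bands are simply summed to show that the split loses no information.
recombined = low + high
```

In this toy setting each band could be handed to a separate model, and the two outputs merged afterwards; the summation above only confirms that the two bands together reconstruct the input.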

Details

Language :
English
ISSN :
0278081X
Volume :
43
Issue :
7
Database :
Complementary Index
Journal :
Circuits, Systems & Signal Processing
Publication Type :
Academic Journal
Accession number :
178461768
Full Text :
https://doi.org/10.1007/s00034-024-02677-3