Back to Search Start Over

A Multi-scale Subconvolutional U-Net with Time-Frequency Attention Mechanism for Single Channel Speech Enhancement.

Authors :
Yechuri, Sivaramakrishna
Komati, Thirupathi Rao
Yellapragada, Rama Krishna
Vanambathina, Sunnydaya
Source :
Circuits, Systems & Signal Processing. Sep2024, Vol. 43 Issue 9, p5682-5710. 29p.
Publication Year :
2024

Abstract

Recent advancements in deep learning-based speech enhancement models have extensively used attention mechanisms to achieve state-of-the-art methods by demonstrating their effectiveness. This paper proposes a novel time-frequency attention (TFA) for speech enhancement that includes a multi-scale subconvolutional U-Net (MSCUNet). The TFA extracts valuable channels, frequencies, and time information from the feature sets and improves speech intelligibility and quality. Channel attention is first performed in TFA to learn weights representing the channels' importance in the input feature set, followed by frequency and time attention mechanisms that are performed simultaneously, using learned weights, to capture both frequency and time attention. Additionally, a U-Net based multi-scale subconvolutional encoder-decoder model used different kernel sizes to extract local and contextual features from the noisy speech. The MSCUNet uses a feature calibration block acting as a gating network to control the information flow among the layers. This enables the scaled features to be weighted in order to retain speech and suppress the noise. Additionally, central layers are employed to exploit the interdependency among the past, current, and future frames to improve predictions. The experimental results show that the proposed TFAMSCUNet mode outperforms several state-of-the-art methods. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0278081X
Volume :
43
Issue :
9
Database :
Academic Search Index
Journal :
Circuits, Systems & Signal Processing
Publication Type :
Academic Journal
Accession number :
179041755
Full Text :
https://doi.org/10.1007/s00034-024-02721-2