A Multi-scale Subconvolutional U-Net with Time-Frequency Attention Mechanism for Single Channel Speech Enhancement.

Authors :: Yechuri, Sivaramakrishna
Komati, Thirupathi Rao
Yellapragada, Rama Krishna
Vanambathina, Sunnydaya
Source :: Circuits, Systems & Signal Processing. Sep2024, Vol. 43 Issue 9, p5682-5710. 29p.
Publication Year :: 2024
Abstract: Recent deep learning-based speech enhancement models have made extensive use of attention mechanisms, which have proven effective in achieving state-of-the-art performance. This paper proposes a novel time-frequency attention (TFA) mechanism for speech enhancement, combined with a multi-scale subconvolutional U-Net (MSCUNet). The TFA extracts valuable channel, frequency, and time information from the feature sets and improves speech intelligibility and quality. Channel attention is performed first in the TFA to learn weights representing the importance of each channel in the input feature set, followed by frequency and time attention mechanisms that are performed simultaneously, using learned weights. Additionally, a U-Net-based multi-scale subconvolutional encoder-decoder model uses different kernel sizes to extract local and contextual features from the noisy speech. The MSCUNet uses a feature calibration block acting as a gating network to control the information flow among the layers. This enables the scaled features to be weighted so as to retain speech and suppress noise. In addition, central layers are employed to exploit the interdependency among past, current, and future frames to improve predictions. The experimental results show that the proposed TFAMSCUNet model outperforms several state-of-the-art methods. [ABSTRACT FROM AUTHOR]
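
Illustrative note (not part of the catalog record or the paper): the ordering described in the abstract, channel attention first and then frequency and time attention computed side by side, can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name `tfa_sketch`, the `[channel][frequency][time]` list layout, and the use of a plain sigmoid over average-pooled values in place of learned weights are all hypothetical simplifications.

```python
import math


def sigmoid(x):
    """Squash a real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def tfa_sketch(x):
    """Apply channel attention, then parallel frequency and time attention.

    x is a nested list indexed as x[channel][frequency][time].
    Assumption: attention weights come from a sigmoid over average-pooled
    features; the paper uses learned weights, which are omitted here.
    """
    C, F, T = len(x), len(x[0]), len(x[0][0])

    # Step 1: channel attention, one weight per channel.
    ch_w = [sigmoid(mean(x[c][f][t] for f in range(F) for t in range(T)))
            for c in range(C)]
    y = [[[ch_w[c] * x[c][f][t] for t in range(T)] for f in range(F)]
         for c in range(C)]

    # Step 2: frequency and time weights, both derived from the same
    # channel-weighted features y (neither depends on the other).
    fr_w = [sigmoid(mean(y[c][f][t] for c in range(C) for t in range(T)))
            for f in range(F)]
    tm_w = [sigmoid(mean(y[c][f][t] for c in range(C) for f in range(F)))
            for t in range(T)]

    # Step 3: scale every time-frequency bin by both weights.
    return [[[y[c][f][t] * fr_w[f] * tm_w[t] for t in range(T)]
             for f in range(F)] for c in range(C)]


# Usage: a toy feature map with 2 channels, 2 frequency bins, 2 frames.
features = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
attended = tfa_sketch(features)
```

Because every weight lies strictly between 0 and 1, the sketch can only attenuate features; a trained model would learn which channels, frequencies, and frames to keep.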

Subjects :: *SPEECH enhancement
*SPEECH
*INFORMATION networks
*INTELLIGIBILITY of speech
*INFORMATION resources management
*CALIBRATION

Details

Language :: English
ISSN :: 0278081X
Volume :: 43
Issue :: 9
Database :: Academic Search Index
Journal :: Circuits, Systems & Signal Processing
Publication Type :: Academic Journal
Accession number :: 179041755
Full Text :: https://doi.org/10.1007/s00034-024-02721-2
