DOA-informed switching independent vector extraction and beamforming for speech enhancement in underdetermined situations

Authors :
Tetsuya Ueda
Tomohiro Nakatani
Rintaro Ikeshita
Shoko Araki
Shoji Makino
Source :
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2024, Iss 1, Pp 1-20 (2024)
Publication Year :
2024
Publisher :
SpringerOpen, 2024.

Abstract

This paper proposes novel methods for extracting a single Speech signal of Interest (SOI) from a multichannel observed signal in underdetermined situations, i.e., when the observed signal contains more speech signals than microphones. It focuses on extracting the SOI using prior knowledge of the SOI’s Direction of Arrival (DOA). Conventional beamformers (BFs) and Blind Source Separation (BSS) with spatial regularization struggle to suppress interference speech signals in such situations. Although Switching Minimum Power Distortionless Response BF (Sw-MPDR) can handle underdetermined situations using a switching mechanism, its estimation accuracy significantly decreases when it relies on a steering vector determined by the SOI’s DOA. Spatially-Regularized Independent Vector Extraction (SRIVE) can robustly enhance the SOI based solely on its DOA using spatial regularization, but its performance degrades in underdetermined situations. This paper extends these conventional methods to overcome their limitations. First, we introduce a time-varying Gaussian (TVG) source model to Sw-MPDR to effectively enhance the SOI based solely on the DOA. Second, we introduce the switching mechanism to SRIVE to improve its speech enhancement performance in underdetermined situations. These two proposed methods are called Switching weighted MPDR (Sw-wMPDR) and Switching SRIVE (Sw-SRIVE). We experimentally demonstrate that both surpass conventional methods in enhancing the SOI using the DOA in underdetermined situations.

Details

Language :
English
ISSN :
1687-4722
Volume :
2024
Issue :
1
Database :
Directory of Open Access Journals
Journal :
EURASIP Journal on Audio, Speech, and Music Processing
Publication Type :
Academic Journal
Accession number :
edsdoj.6dfe3191c04d4e8c5ae94f52bbb48e
Document Type :
article
Full Text :
https://doi.org/10.1186/s13636-024-00373-3