1. A proposed method to improve the WER of an ASR system in the noisy reverberant room.
- Author
- Sadeghi, Mohammad Ebrahim, Sheikhzadeh, Hamid, and Emadi, Mohammad Javad
- Subjects
- *AUTOMATIC speech recognition, *SOUND recording & reproducing, *ACOUSTIC field, *SPHERICAL harmonics, *WHITE noise
- Abstract
This paper proposes a novel approach to reducing the word error rate (WER) of an automatic speech recognition (ASR) system in a noisy reverberant room. The approach integrates beamforming, dereverberation, and ambisonics. Based on the demonstrated formula, the proposed system synthesizes the signal at desired points on the sphere surface from a combination of the 32 signals of a uniform spherical microphone array (USMA). This method uses the non-parametric sound field reproduction technique in the spherical harmonics domain (SHD). A newly suggested geometry determines the locations of the desired points. In addition to improving the dereverberation performance, the proposed method also improves the performance of the beamformer in terms of directivity factor (DF) and white noise gain (WNG). The results show that objective metrics such as PESQ are significantly improved, and the WER of the Kaldi and the WeNet ASR systems is reduced considerably.
• We propose a simplified formula to synthesize the sound field of a point in space.
• We present a new geometry to overcome the diffuse noise on the WPE algorithm.
• We propose a method to rotate the beam pattern of a fixed beamformer without deformation.
• The proposed approach improves speech quality in a noisy reverberant environment.
• This approach combines beamforming, dereverberation and sound field reproduction.
[ABSTRACT FROM AUTHOR]
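For readers unfamiliar with the headline metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition only; the function name `word_error_rate` and the example sentences are illustrative assumptions, and this is not the scoring code used by the authors, Kaldi, or WeNet.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, with no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution out of four reference words.
    print(word_error_rate("turn the lights on", "turn the light on"))
```

Because insertions count as errors, WER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.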
- Published
- 2024