
Improving End-to-End Contextual Speech Recognition via a Word-Matching Algorithm With Backward Search.

Authors :
Kim, Juntae
Lee, Yoonhan
Source :
IEEE Signal Processing Letters; Dec2021, p2087-2091, 5p
Publication Year :
2021

Abstract

End-to-end automatic speech recognition (E2E-ASR) tends to favor words that are common in the training data over rare ones tied to contextual information, such as song names. Recognizing contextual information correctly is therefore a hurdle for E2E-ASR in reaching production level. To overcome this limitation, this work presents an algorithmic post-processing step applied after E2E-ASR, referred to as a word-matching algorithm with backward search (WMA-BS). First, E2E-ASR is allowed to roughly detect the position of target words that have a pronunciation similar to the desired contextual phrases. Then, given the E2E-ASR hypothesis with the rough position of the target words, WMA-BS estimates the correct target words and decides whether to replace them with the contextual phrase according to their phonetic and literal similarity. Applying the proposed method to E2E-ASR achieved a relative improvement of up to 52.7% in word error rate across several harsh conditions. [ABSTRACT FROM AUTHOR]
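
The matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' WMA-BS: the abstract does not specify the phonetic model, the similarity measure, or the search procedure, so everything below is an assumption made for illustration. The hypothetical `phonetic_key` is a crude consonant-class mapping standing in for a real pronunciation model, `similarity` averages a literal and a phonetic string ratio with equal weights, and `replace_with_context` walks backward from an assumed rough end position of the target words, with an arbitrary threshold of 0.75.

```python
from difflib import SequenceMatcher

# Hypothetical, crude phonetic key: consonants with similar sounds share a
# class digit; vowels, spaces, and other characters are dropped. This is a
# stand-in for a real pronunciation model, which the abstract does not specify.
_CLASSES = {
    **dict.fromkeys("bfpv", "1"),
    **dict.fromkeys("cgjkqsxz", "2"),
    **dict.fromkeys("dt", "3"),
    "l": "4",
    **dict.fromkeys("mn", "5"),
    "r": "6",
}


def phonetic_key(text):
    """Map text to a string of consonant-class digits."""
    return "".join(_CLASSES.get(ch, "") for ch in text.lower())


def similarity(a, b):
    """Average of literal and phonetic similarity (equal weights are an assumption)."""
    literal = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    phonetic = SequenceMatcher(None, phonetic_key(a), phonetic_key(b)).ratio()
    return 0.5 * literal + 0.5 * phonetic


def replace_with_context(words, end, phrases, max_span=4, threshold=0.75):
    """Search backward from `end` (exclusive) for the span best matching a phrase.

    `end` plays the role of the rough target-word position reported by the
    recognizer. The span is replaced only if its score reaches `threshold`.
    """
    best_score, best_start, best_phrase = threshold, None, None
    lowest_start = max(end - max_span, 0)
    for start in range(end - 1, lowest_start - 1, -1):
        span = " ".join(words[start:end])
        for phrase in phrases:
            score = similarity(span, phrase)
            if score >= best_score:
                best_score, best_start, best_phrase = score, start, phrase
    if best_phrase is None:
        return list(words)  # no candidate was similar enough; keep hypothesis
    return words[:best_start] + best_phrase.split() + words[end:]


hypothesis = "play shape of yew".split()
corrected = replace_with_context(hypothesis, end=4, phrases=["shape of you"])
print(" ".join(corrected))  # play shape of you
```

In this toy case the span "shape of yew" shares its consonant classes with "shape of you" and differs in only two letters, so it is replaced, while an unrelated hypothesis such as "call mom now" would fall below the threshold and be left unchanged.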

Details

Language :
English
ISSN :
1070-9908
Database :
Complementary Index
Journal :
IEEE Signal Processing Letters
Publication Type :
Academic Journal
Accession number :
154822550
Full Text :
https://doi.org/10.1109/LSP.2021.3117398