
Dynamic adjustment of language models for automatic speech recognition using word similarity

Authors :
Anna Currey
Dominique Fohr
Irina Illina
Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est
Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria) - Université de Lorraine (UL) - Centre National de la Recherche Scientifique (CNRS)
Funding :
ANR-12-BS02-0009, ContNomina (ANR BLANC programme, 2012): Exploiting context for the recognition of proper nouns in diachronic audio documents (Dominique Fohr)
Source :
IEEE Workshop on Spoken Language Technology (SLT 2016), Dec 2016, San Diego, CA, United States
Publication Year :
2016
Publisher :
IEEE, 2016.

Abstract

Out-of-vocabulary (OOV) words can pose a particular problem for automatic speech recognition (ASR) of broadcast news. The language models (LMs) of ASR systems are typically trained on static corpora, whereas new words (particularly new proper nouns) are continually introduced in the media. Additionally, such OOVs are often content-rich proper nouns that are vital to understanding the topic. In this work, we explore methods for dynamically adding OOVs to language models by adapting the n-gram language model used in our ASR system. We propose two strategies: the first relies on finding in-vocabulary (IV) words similar to the OOVs, where word embeddings are used to define similarity. Our second strategy leverages a small contemporary corpus to estimate OOV probabilities. The models we propose yield improvements in perplexity over the baseline; in addition, the corpus-based approach leads to a significant decrease in proper noun error rate over the baseline in recognition experiments.
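The similarity-based strategy described in the abstract can be illustrated with a short sketch. The Python code below is a minimal, hypothetical illustration rather than the authors' implementation: each OOV word is assigned a unigram probability borrowed from its nearest in-vocabulary neighbour in word-embedding space. The toy embeddings, the toy probabilities, and the "share" parameter are assumptions made for the example; in the paper the adaptation is applied to the full n-gram LM of the ASR system.

# Minimal sketch (assumed, not the authors' code) of the similarity-based
# strategy: for each OOV word, find the closest in-vocabulary (IV) word by
# cosine similarity of word embeddings, then transfer a fraction of that IV
# word's unigram probability mass to the OOV. All values below are toy data.
import numpy as np

def cosine(u, v):
    # cosine similarity between two embedding vectors
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def add_oov_unigrams(oov_embeddings, iv_embeddings, iv_unigram_probs, share=0.5):
    """Return unigram probabilities for OOVs by analogy with their nearest IV word.

    share: fraction of the similar IV word's probability given to the OOV
           (a hyperparameter assumed for this sketch).
    """
    oov_probs = {}
    for oov, e_oov in oov_embeddings.items():
        # nearest IV word in embedding space
        best_iv = max(iv_embeddings, key=lambda w: cosine(e_oov, iv_embeddings[w]))
        p_iv = iv_unigram_probs[best_iv]
        oov_probs[oov] = share * p_iv                    # mass assigned to the OOV
        iv_unigram_probs[best_iv] = (1.0 - share) * p_iv # keep the total mass constant
    return oov_probs

# Toy example: "macron" (an OOV proper noun) is closest to the IV word "sarkozy".
iv_emb = {"sarkozy": np.array([0.9, 0.1]), "paris": np.array([0.1, 0.9])}
oov_emb = {"macron": np.array([0.85, 0.2])}
iv_probs = {"sarkozy": 0.002, "paris": 0.004}
print(add_oov_unigrams(oov_emb, iv_emb, iv_probs))  # {'macron': 0.001}

The corpus-based strategy mentioned in the abstract would instead estimate the OOV probabilities directly from counts in a small contemporary corpus; the renormalisation step above (splitting probability mass rather than adding new mass) is one simple way to keep the adapted model a proper distribution.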

Details

Database :
OpenAIRE
Journal :
2016 IEEE Spoken Language Technology Workshop (SLT)
Accession number :
edsair.doi.dedup.....8ca18d1eb3609e521fb848bd293848f8
Full Text :
https://doi.org/10.1109/slt.2016.7846299