Back to Search Start Over

Phone Merging for Code-switched Speech Recognition

Authors :
Sunayana Sitaram
Kalika Bali
Sunit Sivasankaran
Monojit Choudhury
Brij Mohan Lal Srivastava
Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH)
Inria Nancy - Grand Est
Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)
Microsoft Research India [Bangalore]
Microsoft Research
collocated with ACL 2018
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)
Source :
Third Workshop on Computational Approaches to Linguistic Code-switching, Third Workshop on Computational Approaches to Linguistic Code-switching, collocated with ACL 2018 Jul 2018, Melbourne, Australia, CodeSwitch@ACL
Publication Year :
2018
Publisher :
HAL CCSD, 2018.

Abstract

International audience; Speakers in multilingual communities often switch between or mix multiple languages in the same conversation. Automatic Speech Recognition (ASR) of code-switched speech faces many challenges including the influence of phones of different languages on each other. This paper shows evidence that phone sharing between languages improves the Acoustic Model performance for Hindi-English code-switched speech. We compare base-line system built with separate phones for Hindi and English with systems where the phones were manually merged based on linguistic knowledge. Encouraged by the improved ASR performance after manually merging the phones, we further investigate multiple data-driven methods to identify phones to be merged across the languages. We show detailed analysis of automatic phone merging in this language pair and the impact it has on individual phone accuracies and WER. Though the best performance gain of 1.2% WER was observed with manually merged phones, we show experimentally that the manual phone merge is not optimal.

Details

Language :
English
Database :
OpenAIRE
Journal :
Third Workshop on Computational Approaches to Linguistic Code-switching, Third Workshop on Computational Approaches to Linguistic Code-switching, collocated with ACL 2018 Jul 2018, Melbourne, Australia, CodeSwitch@ACL
Accession number :
edsair.doi.dedup.....220490e7102e45d3ab1cb6d5d3814eba