Towards precise and robust automatic synchronization of live speech and its transcripts
- Source :
- Speech Communication. Apr 2011, Vol. 53 Issue 4, p508-523. 16p.
- Publication Year :
- 2011
Abstract
- This paper presents our efforts in automatically synchronizing spoken utterances with their transcripts (textual contents) (ASUT), where the speech is a live stream and its corresponding transcripts are known. This task is first simplified to the problem of detecting the end times of spoken utterances online, and then a solution based on a novel frame-synchronous likelihood ratio test (FSLRT) procedure is proposed. We detail the formulation and implementation of the proposed FSLRT procedure under the Hidden Markov Model (HMM) framework, and we study its properties and parameter settings empirically. Because synchronization failures may occur in FSLRT-based ASUT systems, this paper also extends the FSLRT procedure to a multiple-instance version to increase the robustness of the system. The proposed multiple-instance FSLRT can detect synchronization failures and restart the system from an appropriate point. Therefore, a fully automatic FSLRT-based ASUT system can be constructed. The FSLRT-based ASUT system is evaluated in a simultaneous broadcast news subtitling task. Experimental results show that the proposed method achieves satisfactory performance and outperforms an automatic speech recognition-based method in terms of both robustness and precision. Finally, the FSLRT-based news subtitling system can correctly subtitle about 90% of the sentences with an average time deviation of about 100 ms, running at 0.37 times real time (RT). [Copyright Elsevier]
Details
- Language :
- English
- ISSN :
- 01676393
- Volume :
- 53
- Issue :
- 4
- Database :
- Academic Search Index
- Journal :
- Speech Communication
- Publication Type :
- Academic Journal
- Accession number :
- 59169476
- Full Text :
- https://doi.org/10.1016/j.specom.2011.01.001