Two-pass Endpoint Detection for Speech Recognition

Authors :
Raju, Anirudh
Khare, Aparna
He, Di
Sklyar, Ilya
Chen, Long
Alptekin, Sam
Trinh, Viet Anh
Zhang, Zhe
Vaz, Colin
Ravichandran, Venkatesh
Maas, Roland
Rastrow, Ariya
Publication Year :
2024

Abstract

Endpoint (EP) detection is a key component of far-field speech recognition systems that assist the user through voice commands. The endpoint detector has to trade off accuracy against latency, since waiting longer reduces the number of early cut-offs but adds latency. We propose a novel two-pass solution for endpointing, where the utterance endpoint detected by a first-pass endpointer is verified by a second-pass model termed EP Arbitrator. Our method improves the trade-off between early cut-offs and latency over a baseline endpointer, as tested on datasets including voice-assistant transactional queries, conversational speech, and the public SLURP corpus. We demonstrate that our method shows improvements regardless of the first-pass EP model used.

Comment: ASRU 2023

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2401.08916
Document Type :
Working Paper
Full Text :
https://doi.org/10.1109/ASRU57964.2023.10389743