Transformer-Based Automatic Punctuation Prediction and Word Casing Reconstruction of the ASR Output
- Source :
- Text, Speech, and Dialogue ISBN: 9783030835262, TSD
- Publication Year :
- 2021
- Publisher :
- Springer International Publishing, 2021.
Abstract
- The paper proposes a module for automatic punctuation prediction and casing reconstruction based on transformer architectures (BERT/T5), which constitute the current state of the art in many similar NLP tasks. The main motivation for our work was to increase the readability of ASR output, which usually takes the form of a continuous stream of text without punctuation marks and with all words in lowercase. The resulting punctuation and casing reconstruction module is evaluated on both written text and actual ASR output in three languages (English, Czech, and Slovak).
- Subjects :
- Punctuation predictor
Czech
Word casing reconstruction
Computer science
media_common.quotation_subject
Speech recognition
Punctuation
Readability
language.human_language
T5
ASR
ComputingMethodologies_DOCUMENTANDTEXTPROCESSING
language
Casing
Word (computer architecture)
BERT
media_common
Transformer (machine learning model)
Details
- ISBN :
- 978-3-030-83526-2
- ISBNs :
- 9783030835262
- Database :
- OpenAIRE
- Journal :
- Text, Speech, and Dialogue ISBN: 9783030835262, TSD
- Accession number :
- edsair.doi.dedup.....0afb7b3d748ec7158e437bda8c0d2185
- Full Text :
- https://doi.org/10.1007/978-3-030-83527-9_7