Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi
- Source : ACL
- Publication Year : 2020
- Publisher : arXiv, 2020.
Abstract
- Hindi grapheme-to-phoneme (G2P) conversion is mostly trivial, with one exception: whether a schwa represented in the orthography is pronounced or unpronounced (deleted). Previous work has attempted to predict schwa deletion in a rule-based fashion using prosodic or phonetic analysis. We present the first statistical schwa deletion classifier for Hindi, which relies solely on the orthography as the input and outperforms previous approaches. We trained our model on a newly compiled pronunciation lexicon extracted from various online dictionaries. Our best Hindi model achieves state-of-the-art performance, and also achieves good performance on a closely related language, Punjabi, without modification.
- Comment: 4 pages, 1 figure. To be published in the 2020 Annual Conference of the Association for Computational Linguistics (https://acl2020.org/)
- Subjects :
- Hindi
FOS: Computer and information sciences
Computer Science - Computation and Language
Computer science
business.industry
I.2.7
Orthographic projection
Grapheme
Pronunciation
Lexicon
computer.software_genre
language.human_language
Classifier (linguistics)
language
Schwa
Artificial intelligence
business
computer
Computation and Language (cs.CL)
Orthography
Natural language processing
Details
- Database : OpenAIRE
- Journal : ACL
- Accession number : edsair.doi.dedup.....d32183f6e67a763a0b70122466081363
- Full Text : https://doi.org/10.48550/arxiv.2004.10353