Next syllables prediction system in Dzongkha using long short-term memory

Authors :
Rattapoom Waranusast
Karma Wangchuk
Panomkhawn Riyamongkol
Source :
Journal of King Saud University - Computer and Information Sciences. 34:3800-3806
Publication Year :
2022
Publisher :
Elsevier BV, 2022.

Abstract

Dzongkha typing is time-consuming. A word in Dzongkha is formed by either a single syllable or multiple syllables. The single-syllable word ར (property) and the multi-syllable word ས་སག་སབས་སབསཔ (cloudy) require 6 and 22 keypresses, respectively. Similarly, most syllables and words require several keypresses. To date, no study on syllable prediction has been carried out. Moreover, the lack of a text corpus poses a challenge. The purpose of this study was to develop a next syllables prediction system to reduce keystrokes and typing time. The proposed system takes a single syllable and predicts the top five most probable next syllables. The most suitable syllable is selected to form a word, and the word is then used to predict the next plausible syllables. The corpus was curated from different genres collected from the Dzongkha Development Commission of Bhutan and Kuensel online. The dataset consisted of 31,199 sentences and 222,844 syllables. Using the n-gram method, 195,998 sequences comprising 2,929 unique syllables were generated from the dataset. The text sequences were converted into vectors using word embedding and used to train variants of Recurrent Neural Networks. The single-layer Long Short-Term Memory with 128 memory cells obtained the best training accuracy of 78.33%.
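The data preparation step mentioned in the abstract (turning sentences into n-gram sequences over a syllable vocabulary) might look roughly like the sketch below. It is an illustration only, not the authors' code. It assumes syllables are delimited by the tsheg mark (U+0F0B) and that each sequence is a sentence prefix whose last syllable is the prediction target; the abstract does not state the exact n-gram scheme. The function names `split_syllables`, `prefix_sequences`, `build_vocab`, and `encode` are hypothetical.

```python
from collections import Counter

# Assumption: Dzongkha syllables are separated by the tsheg mark.
TSHEG = "\u0f0b"


def split_syllables(sentence):
    """Split a sentence into syllables on the tsheg mark, dropping empty pieces."""
    return [s for s in sentence.split(TSHEG) if s.strip()]


def prefix_sequences(syllables):
    """Return every prefix of length >= 2; the last item of each is the target."""
    return [syllables[: i + 1] for i in range(1, len(syllables))]


def build_vocab(sentences):
    """Map each unique syllable to an integer id, most frequent first.

    Ids start at 1 so that 0 stays free for padding.
    """
    counts = Counter(s for sent in sentences for s in split_syllables(sent))
    return {syl: idx for idx, (syl, _) in enumerate(counts.most_common(), start=1)}


def encode(sentences, vocab):
    """Turn sentences into (context ids, target id) training pairs."""
    pairs = []
    for sent in sentences:
        for seq in prefix_sequences(split_syllables(sent)):
            ids = [vocab[s] for s in seq]
            pairs.append((ids[:-1], ids[-1]))
    return pairs
```

In a pipeline like the one described, pairs of this kind would be padded to a common length and passed through an embedding layer into the recurrent model, whose five highest-scoring output classes would be offered as suggestions.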

Details

ISSN :
13191578
Volume :
34
Database :
OpenAIRE
Journal :
Journal of King Saud University - Computer and Information Sciences
Accession number :
edsair.doi...........4ea460dbd67a180d012e80d7a4bba3dd