1. Part of speech Tagging for Myanmar using Hidden Markov Model
- Author
Ni Lar Thein and Khine Khine Zin
- Subjects
business.industry, Computer science, Speech recognition, Text segmentation, Context (language use), Speech processing, computer.software_genre, Sequence labeling, ComputingMethodologies_PATTERNRECOGNITION, Unsupervised learning, Artificial intelligence, business, Hidden Markov model, computer, Word (computer architecture), Natural language processing, Sentence
- Abstract
Part-of-speech (POS) tagging is the process of assigning each word the category that best suits both the definition of the word and the context of the sentence in which it is used. In this paper, we describe a machine learning algorithm for Myanmar tagging using a corpus-based approach. Tagging Myanmar text requires word segmentation, POS tagging with a Hidden Markov Model (HMM), and several tag sets. The paper therefore combines supervised and unsupervised learning, which use a pre-tagged and an untagged corpus respectively. To assign the correct tag to each word, we describe supervised POS tagging, which learns class labels from predictor features on a manually tagged corpus, and unsupervised POS tagging, which trains automatically without a manually tagged corpus. Experiments with different amounts of training data are used to find the best configuration, and the reported accuracy is 97.56%.
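The supervised half of the approach described in the abstract can be illustrated with a minimal sketch of a bigram HMM tagger: transition and emission counts are collected from a manually tagged corpus, and Viterbi decoding picks the most probable tag sequence. This is not the authors' implementation. The abstract does not give the tag set, features, or smoothing used, so the add-one smoothing, the function names (`train_hmm`, `viterbi`), and the toy English corpus standing in for segmented Myanmar words are all assumptions made for illustration.

```python
import math
from collections import Counter

START = "<s>"  # hypothetical sentence-start marker


def train_hmm(tagged_sentences):
    """Collect transition and emission counts from a manually tagged corpus."""
    trans, emit = Counter(), Counter()
    from_count, tag_count = Counter(), Counter()
    vocab = set()
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[(prev, tag)] += 1   # tag bigram count
            from_count[prev] += 1     # transitions leaving prev
            emit[(tag, word)] += 1    # word emitted by tag
            tag_count[tag] += 1
            vocab.add(word)
            prev = tag
    tags = sorted(tag_count)
    return trans, emit, from_count, tag_count, tags, vocab


def log_prob(count, total, n_outcomes):
    """Add-one smoothed log probability (an assumption, not from the paper)."""
    return math.log((count + 1) / (total + n_outcomes))


def viterbi(words, model):
    """Return the most probable tag sequence for a list of segmented words."""
    if not words:
        return []
    trans, emit, from_count, tag_count, tags, vocab = model
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for unseen words

    # best[i][tag] = (log score of best path ending in tag, previous tag)
    best = [{}]
    for t in tags:
        score = (log_prob(trans[(START, t)], from_count[START], n_tags)
                 + log_prob(emit[(t, words[0])], tag_count[t], n_words))
        best[0][t] = (score, None)

    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = log_prob(emit[(t, words[i])], tag_count[t], n_words)
            best[i][t] = max(
                (best[i - 1][p][0]
                 + log_prob(trans[(p, t)], from_count[p], n_tags) + e, p)
                for p in tags
            )

    # Follow back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t][0])
    path = [last]
    for i in range(len(words) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    return path[::-1]


# Toy pre-tagged corpus (placeholder data, not from the paper).
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
model = train_hmm(corpus)
print(viterbi(["a", "cat", "barks"], model))
```

The input to `viterbi` is assumed to be already segmented into words, which mirrors the abstract's point that word segmentation comes before tagging for Myanmar. The unsupervised variant mentioned in the abstract would estimate the same two probability tables from an untagged corpus instead of from counts, and is not shown here.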
- Published
- 2009