Thai person name recognition (PNR) using likelihood probability of tokenized words

Authors :
Tiranee Achalakul
Santitham Prom-on
Nareerat Saetiew
Source :
2017 International Electrical Engineering Congress (iEECON).
Publication Year :
2017
Publisher :
IEEE, 2017.

Abstract

Named Entity Recognition (NER) is important in many natural language processing tasks, especially information extraction. Extracting named entities in Thai is much more complicated than in English because written Thai lacks orthographic cues and explicit boundary indicators between words. In this paper, we present work on NER with an emphasis on person name recognition (PNR) in Thai text. Our proposed method consists of four steps. First, the text is tokenized into a set of words. Second, a part-of-name probability is computed for each word using odds with Laplace smoothing and the logistic function. Third, name candidates are selected based on the likelihood probability. Finally, the end point of each name is identified using a set of rules and a drop rate threshold. We then evaluated our method on 1,700 online news articles from the InterBEST 2009 corpus. The results show that the proposed method yields average precision, recall, F-measure, and accuracy of 75.21%, 98.10%, 85.15%, and 81.05%, respectively.
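
The abstract does not give the formulas, so the following Python sketch is only one plausible reading of steps two to four, not the authors' implementation. It assumes that the odds are the ratio of Laplace-smoothed counts of a word inside and outside person names, that the logistic function is applied to the log-odds, and that a name ends when the probability falls by more than a fixed fraction from one token to the next. The function names, thresholds, and word counts are all hypothetical, and the paper's rule set is not modeled.

```python
import math

def part_of_name_probability(in_name: int, out_name: int, alpha: float = 1.0) -> float:
    """Map Laplace-smoothed odds to (0, 1) with the logistic function (assumed form)."""
    odds = (in_name + alpha) / (out_name + alpha)   # add-alpha smoothing on both counts
    return 1.0 / (1.0 + math.exp(-math.log(odds)))  # logistic of the log-odds

def find_name_spans(tokens, counts, start_threshold=0.7, drop_rate=0.5):
    """Return (start, end) token index pairs, end exclusive.

    A span starts at a token whose probability reaches start_threshold and
    is extended while each next token keeps at least (1 - drop_rate) of the
    previous token's probability. Both thresholds are illustrative values.
    """
    probs = [part_of_name_probability(*counts.get(t, (0, 0))) for t in tokens]
    spans = []
    i = 0
    while i < len(tokens):
        if probs[i] >= start_threshold:
            j = i + 1
            while j < len(tokens) and probs[j] >= probs[j - 1] * (1.0 - drop_rate):
                j += 1
            spans.append((i, j))
            i = j
        else:
            i += 1
    return spans

# Already-tokenized input (step one); the counts are made up for illustration:
# word -> (occurrences inside person names, occurrences elsewhere)
tokens = ["นาย", "สมชาย", "ใจดี", "กล่าว", "ว่า"]
counts = {
    "นาย": (90, 10),
    "สมชาย": (80, 2),
    "ใจดี": (30, 20),
    "กล่าว": (0, 200),
    "ว่า": (0, 300),
}
spans = find_name_spans(tokens, counts)
print(spans)
```

With these made-up counts, the first three tokens form a single candidate span, and the sharp probability drop at the fourth token closes it. A word absent from the counts gets odds of 1 and therefore a probability of 0.5 under this smoothing.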

Details

Database :
OpenAIRE
Journal :
2017 International Electrical Engineering Congress (iEECON)
Accession number :
edsair.doi...........b32ca37df433ce3bb1969ac5168d9142
Full Text :
https://doi.org/10.1109/ieecon.2017.8075816