Thai person name recognition (PNR) using likelihood probability of tokenized words
- Source :
- 2017 International Electrical Engineering Congress (iEECON).
- Publication Year :
- 2017
- Publisher :
- IEEE, 2017.
Abstract
- Named Entity Recognition (NER) is very important in many natural language processing tasks, especially information extraction. NE extraction in Thai is much more complicated than in English because written Thai lacks orthographic cues and explicit boundary indicators between words. In this paper, we present research on NER with an emphasis on person name recognition (PNR) in Thai text. Our proposed method consists of four steps. First, the text is tokenized into a set of words. Second, a part-of-name probability is computed for each word using odds with Laplace smoothing and a logistic function. Third, name candidates are selected based on the likelihood probability. Finally, the end point of each name is identified using a set of rules and a drop rate threshold. We then evaluated our method using 1,700 online news articles from the InterBEST 2009 corpus. The results show that the proposed method yields average precision, recall, F-measure, and accuracy of 75.21%, 98.10%, 85.15%, and 81.05%, respectively.
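The second and third steps of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the code assumes the odds are the ratio of add-alpha (Laplace) smoothed counts and that the logistic function is applied to the log-odds. The function names, the `alpha` and `threshold` parameters, and the toy counts are all hypothetical.

```python
import math

def part_of_name_probability(in_name_count, out_name_count, alpha=1.0):
    """Map Laplace-smoothed odds to (0, 1) with the logistic function.

    Assumption: odds = (in + alpha) / (out + alpha), and the logistic
    function is applied to log(odds). The paper may define these differently.
    """
    odds = (in_name_count + alpha) / (out_name_count + alpha)
    return 1.0 / (1.0 + math.exp(-math.log(odds)))

def select_candidates(tokens, counts, threshold=0.5):
    """Return (token, probability) pairs scoring strictly above the threshold.

    Tokens missing from `counts` are treated as (0, 0), i.e. never observed.
    """
    scored = [(t, part_of_name_probability(*counts.get(t, (0, 0)))) for t in tokens]
    return [(t, p) for t, p in scored if p > threshold]

# Made-up counts: (times seen inside a name, times seen outside a name).
toy_counts = {"somchai": (9, 0), "said": (0, 19)}
candidates = select_candidates(["somchai", "said", "unknown"], toy_counts)
```

With smoothing, a word never seen in training gets odds of 1 and therefore a neutral score of 0.5 rather than a division by zero, which is why the threshold comparison above is strict.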
- Subjects :
- Computer science
Speech recognition
020206 networking & telecommunications
02 engineering and technology
computer.software_genre
Field (computer science)
Odds
Set (abstract data type)
Information extraction
Named-entity recognition
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Additive smoothing
computer
Word (computer architecture)
Orthography
Details
- Database :
- OpenAIRE
- Journal :
- 2017 International Electrical Engineering Congress (iEECON)
- Accession number :
- edsair.doi...........b32ca37df433ce3bb1969ac5168d9142
- Full Text :
- https://doi.org/10.1109/ieecon.2017.8075816