Persian Named Entity Recognition
- Author
Newton Howard; Kia Dashtipour; Abdulrahman Algarafi; Amir Hussain; Ahsan Adeel; Mandar Gogate; Howard, N.; Wang, Y.; Hussain, A.; Widrow, B.; Zadeh, L. A.
- Subjects
Computer science; Engineering and technology; Handheld computers; Tools; Medical and health sciences; Entity linking; Clinical medicine; Named-entity recognition; Electrical engineering, electronic engineering, information engineering; General & internal medicine; Affective computing; Persian; Text recognition; Grammar; Support vector machines; Persian grammar; Natural language processing; Sentiment analysis; Information extraction; Dictionaries; Artificial intelligence & image processing; Artificial intelligence
- Abstract
Named Entity Recognition (NER) is an important natural language processing (NLP) tool for information extraction and retrieval from unstructured texts such as newspapers, blogs and emails. NER classifies words or expressions in unstructured text into relevant categories. In the literature, NER systems have been developed for various languages, but little work has addressed Persian text, owing to the limited resources (such as corpora and lexicons) and tools available for Persian named entities. In this paper, a novel scalable system for Persian Named Entity Recognition (PNER) is presented. The proposed PNER recognizes and extracts the three most important named entities in Persian script: person name, location and date. It combines a grammatical rule-based approach with machine learning, integrating dictionaries of Persian named entities, Persian grammar rules and a Support Vector Machine (SVM). Evaluated in terms of precision, recall and F-measure, PNER achieves results comparable with state-of-the-art NER frameworks in other languages.
- Published
- 2017