A Machine Learning Approach for Named Entity Recognition in Classical Arabic Natural Language Processing.

Authors :
Salah, Ramzi
Mukred, Muaadh
Zakaria, Lailatul Qadri binti
Al-Yarimi, Fuad A. M.
Source :
KSII Transactions on Internet & Information Systems; Oct 2024, Vol. 18, Issue 10, pp. 2895-2919 (25 pages)
Publication Year :
2024

Abstract

Named Entity Recognition (NER) is a key element of many Natural Language Processing (NLP) applications. It involves identifying spans of text and classifying them into categories, such as a location or a person's name. Arabic NER (ANER) also supports numerous other Arabic NLP (ANLP) tasks, such as Machine Translation (MT), Question Answering (QA), and Information Extraction (IE). ANER systems generally fall into three major groups: rule-based, Machine Learning (ML), and hybrid. This study examines ML-based ANER developments, particularly for Classical Arabic, which presents unique challenges due to its complex morphological structure and limited linguistic resources. We propose a supervised approach that integrates word-level, morphological, and knowledge-based features to improve NER performance for Classical Arabic. Our method was evaluated on the CANERCorpus, a specialized dataset containing annotated texts from Classical Arabic literature. The Naive Bayes (NB) approach achieved an F-measure of 80%, with precision and recall of 86% and 75%, respectively. These results indicate a significant improvement over traditional methods, particularly in handling the intricate structure of Classical Arabic. The study highlights the potential of ML for overcoming the challenges of ANER and provides directions for further research in this domain. [ABSTRACT FROM AUTHOR]
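As a reading aid, the reported scores can be checked against each other. The minimal sketch below assumes the abstract's F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f_measure` and the `beta` parameter are illustrative and are not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1.0 gives the balanced F1 score.

    Assumption: the abstract's "F-measure" is F1. The paper may define it
    differently.
    """
    if precision <= 0.0 or recall <= 0.0:
        # By convention the score is 0 when either input is 0.
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract (86% and 75%).
f1 = f_measure(0.86, 0.75)
print(f"F1 = {f1:.3f}")  # prints "F1 = 0.801"
```

Under this assumption, the reported precision and recall are consistent with the stated F-measure of 80% after rounding.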

Details

Language :
English
ISSN :
1976-7277
Volume :
18
Issue :
10
Database :
Supplemental Index
Journal :
KSII Transactions on Internet & Information Systems
Publication Type :
Academic Journal
Accession number :
180731024
Full Text :
https://doi.org/10.3837/tiis.2024.10.005