Moroccan Arabic vocabulary generation using a rule-based approach.
- Source :
- Journal of King Saud University - Computer & Information Sciences; Nov2022:Part A, Vol. 34 Issue 10, p8538-8548, 11p
- Publication Year :
- 2022
- Abstract :
- NLP resources play a crucial role in building many NLP applications. Their value depends not only on their size and coverage but also on the richness and precision of the annotations they provide. For resource-scarce languages such as Moroccan Arabic (MA), the lack of such resources limits the development of NLP applications. To overcome this problem, we follow a rule-based approach to generate a Moroccan morphological vocabulary (MORV), which constitutes a first step toward addressing the problem of Moroccan morphological generation. MORV is designed and implemented around two main components: first, an MA lexicon and a list of fully annotated affixes and clitics that we created specifically to support the generation process; second, a set of rules covering the concatenation and the orthographic adjustment of the generated words. Starting from base forms, MORV outputs more than 4.5 million Moroccan words annotated with rich morphological features such as tense, gender, number, and state. We tested MORV on texts collected from Moroccan social media and found that it reaches a vocabulary coverage of 84% and a precision of 94%. The system can benefit other NLP applications such as spell checking, morphological analysis, and machine translation. [ABSTRACT FROM AUTHOR]
- Subjects :
- MACHINE translating; VOCABULARY; NATURAL language processing; ARABIC language
- Language :
- English
- ISSN :
- 13191578
- Volume :
- 34
- Issue :
- 10
- Database :
- Supplemental Index
- Journal :
- Journal of King Saud University - Computer & Information Sciences
- Publication Type :
- Academic Journal
- Accession number :
- 160169854
- Full Text :
- https://doi.org/10.1016/j.jksuci.2021.02.013
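
The abstract describes a generation scheme built from a lexicon, annotated affixes and clitics, and rules for concatenation and orthographic adjustment, evaluated by vocabulary coverage. The following is a minimal sketch of that general idea, not the authors' implementation. The lexicon entry, affix lists, feature names, the `adjust` rule, and the type-based definition of coverage are all hypothetical placeholders chosen for illustration, and make no claim about actual Moroccan Arabic orthography or about MORV's data.

```python
from itertools import product

# Toy, transliterated placeholders; NOT the lexicon or affix list from the paper.
LEXICON = [{"stem": "kteb", "pos": "verb"}]
PROCLITICS = [
    {"form": "", "feats": {}},
    {"form": "w", "feats": {"conj": "and"}},
]
SUFFIXES = [
    {"form": "", "feats": {"person": 3, "number": "sg"}},
    {"form": "t", "feats": {"person": 1, "number": "sg"}},
    {"form": "u", "feats": {"person": 3, "number": "pl"}},
]
VOWELS = set("aeiou")


def adjust(stem, suffix):
    """Concatenate stem and suffix with one hypothetical orthographic rule:
    drop the stem's last vowel when the suffix starts with a vowel."""
    if suffix and suffix[0] in VOWELS:
        for i in range(len(stem) - 1, -1, -1):
            if stem[i] in VOWELS:
                stem = stem[:i] + stem[i + 1:]
                break
    return stem + suffix


def generate():
    """Yield (surface form, feature dict) for every proclitic/stem/suffix combination."""
    for entry, pro, suf in product(LEXICON, PROCLITICS, SUFFIXES):
        surface = pro["form"] + adjust(entry["stem"], suf["form"])
        feats = {"lemma": entry["stem"], "pos": entry["pos"],
                 **pro["feats"], **suf["feats"]}
        yield surface, feats


def coverage(vocab, tokens):
    """Share of distinct token types found in the generated vocabulary
    (an assumed definition; the paper may measure coverage differently)."""
    types = set(tokens)
    return len(types & vocab) / len(types) if types else 0.0


if __name__ == "__main__":
    vocab = {surface for surface, _ in generate()}
    print(sorted(vocab))
    print(coverage(vocab, ["ktebt", "wktbu", "salam", "ktebt"]))
```

Keeping the affix tables and the adjustment rule separate from the concatenation loop mirrors the two-component split described in the abstract: new affixes or rules can be added without touching the generator itself.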