Automatic Arabic Spelling Errors Detection and Correction Based on Confusion Matrix-Noisy Channel Hybrid System.

Authors :
Noaman, Hatem M.
Sarhan, Shahenda S.
Rashwan, M. A. A.
Source :
Egyptian Computer Science Journal; May 2016, Vol. 40 Issue 2, p54-64, 11p
Publication Year :
2016

Abstract

Arabic spelling errors occur in many types of documents, such as text handwritten by inexperienced users, optical character recognition (OCR) output, and machine-translated documents. Many researchers have tried to solve this problem, but so far there is no definitive solution. This paper proposes a hybrid system that combines a confusion matrix with the noisy channel spelling correction model to detect and correct Arabic spelling errors automatically. The system builds a robust error confusion matrix from 163,452 pairs of spelling errors and their corrected forms, extracted from the Qatar Arabic Language Bank (QALB), and uses this matrix together with a language model to generate a list of candidates and choose the most appropriate one for a given misspelled word. In the reported comparison, the proposed system outperforms the other systems evaluated. [ABSTRACT FROM AUTHOR]
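The general idea of pairing a confusion matrix with a noisy channel model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it models character substitutions only, uses a unigram language model, applies add-one smoothing, and runs on a toy Latin-alphabet data set. The names `build_confusion`, `channel_prob`, and `correct_word`, and all sample data, are hypothetical.

```python
from collections import Counter, defaultdict


def build_confusion(pairs):
    """Count how often an intended character was typed as another character.

    confusion[intended][typed] is the observed count. Assumption: only
    equal-length (substitution-only) pairs are used in this sketch.
    """
    confusion = defaultdict(Counter)
    for error, intended in pairs:
        if len(error) != len(intended):
            continue  # insertions and deletions are out of scope here
        for typed_ch, intended_ch in zip(error, intended):
            confusion[intended_ch][typed_ch] += 1
    return confusion


def channel_prob(error, candidate, confusion, alphabet_size):
    """Estimate P(error | candidate) with add-one smoothing per character."""
    prob = 1.0
    for typed_ch, intended_ch in zip(error, candidate):
        row = confusion[intended_ch]
        prob *= (row[typed_ch] + 1) / (sum(row.values()) + alphabet_size)
    return prob


def correct_word(word, vocab_counts, confusion, alphabet_size):
    """Return the candidate maximising P(word | candidate) * P(candidate)."""
    total = sum(vocab_counts.values())
    best, best_score = word, 0.0
    for candidate, count in vocab_counts.items():
        if len(candidate) != len(word):
            continue  # substitution-only candidates in this sketch
        prior = count / total  # unigram language model
        score = channel_prob(word, candidate, confusion, alphabet_size) * prior
        if score > best_score:
            best, best_score = candidate, score
    return best


# Toy (error, correction) pairs and word frequencies, invented for illustration.
pairs = [("cst", "cat"), ("csn", "can"), ("hst", "hat"),
         ("cat", "cat"), ("can", "can")]
vocab_counts = Counter({"cat": 5, "cot": 5, "hat": 2})
confusion = build_confusion(pairs)

print(correct_word("cst", vocab_counts, confusion, alphabet_size=26))  # cat
```

In this toy data, "cat" and "cot" are equally frequent, so the language model alone cannot separate them; the confusion matrix, which has seen "a" typed as "s" three times, tips the decision toward "cat".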

Details

Language :
English
ISSN :
1110-2586
Volume :
40
Issue :
2
Database :
Complementary Index
Journal :
Egyptian Computer Science Journal
Publication Type :
Academic Journal
Accession number :
116338909