High Accuracy Farsi Language Character Segmentation and Recognition

Authors :
Pantea Kiaei
Hoda Mohammadzade
Mojan Javaheripi
Source :
2019 27th Iranian Conference on Electrical Engineering (ICEE).
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

Despite many advances in optical character recognition in general, serious challenges remain in recognizing Farsi text. The main reason is the cursive nature of written Farsi: depending on its position within a word, a letter may join its neighboring letters, which changes its shape. As a result, each letter can have up to four different character shapes. In addition to the problem of segmenting the characters, the increased number of character shapes makes the recognition task even more challenging. This paper introduces a complete framework for character recognition, including a method for segmenting the characters and one for classifying the resulting separated characters. Character segmentation is performed using a new sliding-window algorithm with an accuracy of 98.23%. With a total of 32 Farsi letters resulting in 114 character shapes, an almost perfect character recognition rate of 99.94% is achieved using the proposed Fisher characters method. The final system, including the segmentation and recognition modules, achieves a recognition rate of 98.17% and is robust to the scale and rotation of the image and to the font size of the written text.
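
The three rates quoted in the abstract can be checked against one another with a short calculation. The sketch below assumes, which the abstract does not state, that a character counts as correctly recognized end to end only when it is first segmented correctly and then classified correctly, and that the two stages fail independently. The variable names are illustrative and do not come from the paper.

```python
# Rates reported in the abstract, expressed as fractions.
segmentation_accuracy = 0.9823   # sliding-window segmentation
recognition_accuracy = 0.9994    # Fisher characters classifier on separated characters
reported_end_to_end = 0.9817     # full system

# Assumption (not stated in the abstract): the stages are independent, so the
# end-to-end rate is approximately the product of the two stage rates.
estimated_end_to_end = segmentation_accuracy * recognition_accuracy

print(f"estimated end-to-end rate: {estimated_end_to_end:.4%}")
print(f"reported end-to-end rate:  {reported_end_to_end:.4%}")
```

Under this assumption the product rounds to 98.17%, which matches the reported system rate and suggests that nearly all end-to-end errors originate in the segmentation stage.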

Details

Database :
OpenAIRE
Journal :
2019 27th Iranian Conference on Electrical Engineering (ICEE)
Accession number :
edsair.doi...........077edfa2ca28e7ceba06f1bebd2eb786