1. Deep Sparse Auto-Encoder Features Learning for Arabic Text Recognition
- Author
- Adel M. Alimi, Maroua Tounsi, Amir Hussain, and Najoua Rahal
- Subjects
feature learning; Arabic text recognition; Vocabulary; General Computer Science; Computer science; Arabic; media_common.quotation_subject; Feature extraction; ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION; 0507 social and economic geography; 02 engineering and technology; computer.software_genre; Discriminative model; Classifier (linguistics); 0202 electrical engineering, electronic engineering, information engineering; General Materials Science; sparse auto-encoder; Hidden Markov model; Cursive; media_common; hidden Markov models; business.industry; Deep learning; 05 social sciences; General Engineering; Autoencoder; bag of features; language.human_language; ComputingMethodologies_PATTERNRECOGNITION; Pattern recognition (psychology); ComputingMethodologies_DOCUMENTANDTEXTPROCESSING; language; 020201 artificial intelligence & image processing; lcsh:Electrical engineering. Electronics. Nuclear engineering; Artificial intelligence; business; lcsh:TK1-9971; 050703 geography; computer; MNIST database; Natural language processing
- Abstract
Arabic text recognition is one of the most challenging current problems in pattern recognition and artificial intelligence, and it remains an open research problem for several reasons. Complications arise from the cursive nature of Arabic writing, similarities between characters, an unlimited vocabulary, and the use of multiple sizes and mixed fonts. To handle these challenges, automatic Arabic text recognition requires a robust system that combines discriminative features with a rigorous classifier to improve performance. In this work, we introduce a new deep-learning-based system that recognizes Arabic text contained in images. We propose a novel hybrid network that combines a Bag-of-Features (BoF) framework for feature extraction, based on a deep Sparse Auto-Encoder (SAE), with Hidden Markov Models (HMMs) for sequence recognition. The proposed system, termed BoF-deep SAE-HMM, is tested on four datasets: the printed Arabic line images of Printed KHATT (P-KHATT), the benchmark printed word images of Arabic Printed Text Image (APTI), the benchmark handwritten Arabic word images of IFN/ENIT, and the benchmark handwritten digit images of the Modified National Institute of Standards and Technology (MNIST) database.
- Published
- 2021