1. MuSIC: A Novel Multi-Scale Deep Neural Framework for Script Identification in the Wild
- Author
- Tauseef Khan, Md. Saif, and Ayatullah Faruk Mollah
- Subjects
Script identification, MuSIC, multi-scale, convolutional neural network, multi-script, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Script identification in digital images is crucial for automated text reading in multilingual contexts. Developing a robust script identifier for complex environments is challenging, given the prevalence of mobile devices and digitized documents. This paper presents a novel multi-scale image classification framework, named MuSIC, for identifying scripts in document, scene, and video text. First, multiple CNNs simultaneously process scaled maps of the input image to produce scale-wise predictions. A weight computation module assigns a unique weight to each scaled map by measuring the deviation in the number of object-area pixels relative to the original image. In the weight-aware decision mechanism, the bag-of-prediction scores for the possible output classes are updated by adding a scale's weight to a class's score whenever that scale's CNN prediction matches the class. Finally, the class with the highest score is selected as the script of the image. The key features of MuSIC are scale-wise weight computation followed by the weight-aware decision mechanism, which yields more accurate outcomes than conventional majority voting in multi-scale image classification. MuSIC is evaluated on three public datasets, AUTNT(s), CVSI-2015, and ICDAR 2019-MLT, which include Indic, non-Indic, East Asian, and Indian regional scripts across documents, scenes, and videos. The model achieves classification accuracies of 98.28%, 96.18%, and 98.03% on three subsets of AUTNT (AUTNT-document, AUTNT-scene, and AUTNT-mixed), and 95.92% and 93.83% on CVSI-2015 and ICDAR 2019-MLT, respectively. The results demonstrate the robustness of MuSIC across multiple assessments. The source code, usage guidelines, and initial benchmark performance of MuSIC are available at https://github.com/iilabau/MuSIC for academic, research, and non-commercial purposes.
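The weight-aware decision step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the MuSIC repository. The function names `majority_vote` and `weight_aware_vote`, the script labels, and the numeric weights are all hypothetical, and the weights are supplied directly because the abstract does not give the formula that derives them from object-area pixel counts.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Conventional baseline: every scale counts equally."""
    return Counter(predictions).most_common(1)[0][0]


def weight_aware_vote(predictions, weights):
    """Add each scale's weight to the score of the class its CNN predicted.

    predictions: one class label per scaled map (hypothetical CNN outputs).
    weights: one weight per scaled map (assumed to be given; the paper's
             weight formula is not reproduced here).
    Returns the winning label and the per-class score table.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per scale-wise prediction")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    winner = max(scores, key=scores.get)
    return winner, dict(scores)


# Hypothetical outputs of three scale-specific CNNs and their scale weights.
predictions = ["Bengali", "Devanagari", "Devanagari"]
weights = [0.5, 0.25, 0.125]

print(majority_vote(predictions))               # Devanagari
print(weight_aware_vote(predictions, weights))  # ('Bengali', {'Bengali': 0.5, 'Devanagari': 0.375})
```

In this toy case the two rules disagree: two low-weight scales outvote one high-weight scale under majority voting, while the weighted scores favor the prediction from the scale judged most reliable.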
- Published
- 2024