A comparison of deep transfer learning backbone architecture techniques for printed text detection of different font styles from unstructured documents.

Authors :
Mahadevkar, Supriya
Patil, Shruti
Kotecha, Ketan
Abraham, Ajith
Source :
PeerJ Computer Science; Feb2024, p1-25, 25p
Publication Year :
2024

Abstract

Object detection methods based on deep learning have been used in a variety of sectors, including banking, healthcare, e-governance, and academia. In recent years, considerable research attention has been paid to text detection and recognition from scenes or images in unstructured document processing. The article's novelty lies in the detailed discussion and implementation of various transfer learning-based backbone architectures for printed text recognition. In this research article, the authors compared the ResNet50, ResNet50V2, ResNet152V2, Inception, Xception, and VGG19 backbone architectures, with preprocessing techniques such as data resizing, normalization, and noise removal, on a standard OCR Kaggle dataset. Further, the top three backbone architectures were selected based on the accuracy achieved, and hyperparameter tuning was then performed to achieve more accurate results. Xception performed well compared with the ResNet, Inception, VGG19, and MobileNet architectures, achieving high evaluation scores with accuracy (98.90%) and minimum loss (0.19). As per existing research in this domain, transfer learning-based backbone architectures used for printed or handwritten data recognition are, until now, not well represented in the literature. We split the total dataset into 80 percent for training and 20 percent for testing, trained the different backbone architecture models with the same number of epochs, and found that the Xception architecture achieved higher accuracy than the others. In addition, the ResNet50V2 model gave us higher accuracy (96.92%) than the ResNet152V2 model (96.34%). [ABSTRACT FROM AUTHOR]
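
The 80/20 split and the normalization step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the fixed seed, the assumption of 8-bit pixel values, and the stand-in data are all hypothetical, and only the Python standard library is used.

```python
import random


def train_test_split_80_20(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of `samples` and split it into train and test lists.

    `train_fraction` and `seed` are illustrative assumptions; the abstract
    only states an 80/20 split.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def normalize_pixels(image):
    """Scale 8-bit pixel values (0-255) to floats in the range [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


# Hypothetical stand-in data: 100 (image_id, label) pairs.
samples = [(i, i % 10) for i in range(100)]
train, test = train_test_split_80_20(samples)
```

Fixing the seed keeps the split reproducible, so that each backbone would be trained and evaluated on the same partitions.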

Details

Language :
English
ISSN :
23765992
Database :
Complementary Index
Journal :
PeerJ Computer Science
Publication Type :
Academic Journal
Accession number :
175889748
Full Text :
https://doi.org/10.7717/peerj-cs.1769