CycleAugment: Efficient data augmentation strategy for handwritten text recognition in historical document images

Authors :
Sarayut Gonwirat
Olarik Surinta
Source :
Engineering and Applied Science Research, Vol 49, Iss 4, Pp 505-520 (2022)
Publication Year :
2022
Publisher :
Khon Kaen University, 2022.

Abstract

Predicting the sequence pattern of handwritten text images is challenging because of varied writing styles, insufficient training data, and background noise in the images. The combination of a convolutional neural network (CNN) and a recurrent neural network (RNN), known as the CRNN architecture, is the most successful sequence learning method for handwritten text recognition systems. For handwritten text recognition in historical Thai document images, we first trained nine different CRNN architectures, both from scratch and with transfer learning, to identify the most effective technique. We found that transfer learning does not significantly outperform training from scratch. Second, we trained the CRNN model with basic transformation-based data augmentation techniques: shifting, rotation, and shearing. Data augmentation yielded more accurate models than training without it, but the improvement was not significant. The conventional training strategy aims to find the global minimum and does not always solve the overfitting problem. Third, we proposed a cyclical data augmentation strategy, called CycleAugment, to discover many local minima and prevent overfitting. In each cycle, the training loss decreases rapidly to reach a local minimum. The CycleAugment strategy lets the CRNN model learn from input images both with and without data augmentation, exposing it to many input patterns. As a result, the CycleAugment strategy consistently achieved the best performance among the compared strategies. Finally, we prevented image distortion by applying a simple technique to short word images, which further improved performance on the historical Thai document image dataset.
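
The cyclical idea described in the abstract, training alternately with and without augmentation, can be illustrated with a small scheduling sketch. This is not the authors' implementation: the abstract does not state how a cycle is divided, so the split used here (an augmented first part followed by a non-augmented remainder) is an assumption, and the names `use_augmentation`, `build_schedule`, `cycle_length`, and `augment_fraction` are hypothetical.

```python
from typing import List


def use_augmentation(epoch: int, cycle_length: int = 10,
                     augment_fraction: float = 0.5) -> bool:
    """Return True if this 0-based epoch falls in the augmented part of its cycle.

    Assumption (not from the paper): each cycle starts with augmented epochs
    and ends with epochs trained on the original, non-augmented images.
    """
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    if not 0.0 <= augment_fraction <= 1.0:
        raise ValueError("augment_fraction must be in [0, 1]")
    position_in_cycle = epoch % cycle_length
    augmented_epochs = int(cycle_length * augment_fraction)
    return position_in_cycle < augmented_epochs


def build_schedule(total_epochs: int, cycle_length: int = 10,
                   augment_fraction: float = 0.5) -> List[bool]:
    """List one flag per epoch: True means train with augmentation that epoch."""
    return [use_augmentation(epoch, cycle_length, augment_fraction)
            for epoch in range(total_epochs)]


if __name__ == "__main__":
    # Two cycles of four epochs each, half of every cycle augmented.
    for epoch, augmented in enumerate(build_schedule(8, cycle_length=4)):
        mode = "augmented" if augmented else "original"
        print(f"epoch {epoch}: train on {mode} images")
```

In a training loop, the flag for the current epoch would decide whether shifting, rotation, and shearing are applied to the batch before it is passed to the CRNN model.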

Details

Language :
English
ISSN :
2539-6161 and 2539-6218
Volume :
49
Issue :
4
Database :
Directory of Open Access Journals
Journal :
Engineering and Applied Science Research
Publication Type :
Academic Journal
Accession number :
edsdoj.b11f04841e4f4463a61998f0ceeb6786
Document Type :
article