An Attention-Based End-to-End Model for Multiple Text Lines Recognition in Japanese Historical Documents

Authors :
Masaki Nakagawa
Cuong Tuan Nguyen
Nam Tuan Ly
Source :
ICDAR
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

This paper presents an attention-based convolutional sequence-to-sequence (ACseq2seq) model that recognizes an input image containing multiple text lines from Japanese historical documents without explicit line segmentation. The recognition system has three main parts: a feature extractor that uses a convolutional neural network (CNN) to extract a feature sequence from the input image; an encoder that employs a bidirectional long short-term memory (BLSTM) network to encode the feature sequence; and a decoder that uses a unidirectional LSTM with an attention mechanism to generate the target text from the attended features. We also introduce a residual LSTM network between the attention vector and the softmax layer in the decoder. The system can be trained end-to-end with a standard cross-entropy loss function. We evaluate the ACseq2seq model on the anomalously deformed Kana datasets from the PRMU contest. The experimental results show that the proposed model achieves higher recognition accuracy than state-of-the-art recognition methods on these datasets.
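
As a rough illustration of the decoder step the abstract describes (attend over the encoded feature sequence, then score the target character with cross-entropy), the following NumPy sketch may help. It is not the authors' implementation: the abstract does not state which attention scoring function is used, so dot-product scoring is assumed here; the residual LSTM between the attention vector and the softmax layer is omitted; and the names `attend`, `cross_entropy`, `w_out`, and all array sizes are hypothetical.

```python
import numpy as np

def softmax(x):
    # Subtract the maximum for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(encoder_states, decoder_state):
    """One attention step (dot-product scoring is an assumption).

    encoder_states: (T, d) encoded feature sequence, e.g. BLSTM outputs
    decoder_state:  (d,)   current decoder LSTM hidden state
    """
    scores = encoder_states @ decoder_state   # (T,) one score per position
    weights = softmax(scores)                 # (T,) attention weights, sum to 1
    context = weights @ encoder_states        # (d,) weighted sum of features
    return context, weights

def cross_entropy(logits, target_index):
    # Negative log-probability of the correct character class.
    return -np.log(softmax(logits)[target_index])

# Toy sizes (hypothetical): 6 feature positions, 4 dimensions, 5 characters.
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(6, 4))
decoder_state = rng.normal(size=4)
w_out = rng.normal(size=(5, 8))               # hypothetical output projection

context, weights = attend(encoder_states, decoder_state)
# Plain linear projection of [context; decoder_state] in place of the
# residual LSTM mentioned in the abstract.
logits = w_out @ np.concatenate([context, decoder_state])
loss = cross_entropy(logits, target_index=2)
```

In a full model, this step would run once per output character, and the per-character losses would be summed so that the CNN, encoder, and decoder are all updated together.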

Details

Database :
OpenAIRE
Journal :
2019 International Conference on Document Analysis and Recognition (ICDAR)
Accession number :
edsair.doi...........494b1aa34fb14ac5ccc52182bb035a8d
Full Text :
https://doi.org/10.1109/icdar.2019.00106