An Attention-Based End-to-End Model for Multiple Text Lines Recognition in Japanese Historical Documents
- Source :
- ICDAR
- Publication Year :
- 2019
- Publisher :
- IEEE, 2019.
Abstract
- This paper presents an attention-based convolutional sequence-to-sequence (ACseq2seq) model that recognizes an input image of multiple text lines from Japanese historical documents without explicit line segmentation. The recognition system has three main parts: a feature extractor that uses a Convolutional Neural Network (CNN) to extract a feature sequence from the input image; an encoder that uses a bidirectional Long Short-Term Memory (BLSTM) network to encode the feature sequence; and a decoder that uses a unidirectional LSTM with an attention mechanism to generate the target text from the attended features. We also introduce a residual LSTM network between the attention vector and the softmax layer in the decoder. The system can be trained end-to-end with a standard cross-entropy loss function. In the experiments, we evaluate the ACseq2seq model on the anomalously deformed Kana datasets from the PRMU contest. The results show that the proposed model achieves higher recognition accuracy than state-of-the-art recognition methods on these datasets.
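The attention step and the cross-entropy loss named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the paper's implementation: the dot-product scoring, the toy vectors, and the function names `attend` and `cross_entropy` are assumptions made for illustration, since the abstract does not specify the scoring function or the layer sizes. The CNN, BLSTM, and residual LSTM components are omitted.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoding step.

    Scoring is a plain dot product, an assumption for illustration;
    the abstract does not say which scoring function the paper uses.
    """
    scores = [sum(d * h for d, h in zip(decoder_state, enc))
              for enc in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: attention-weighted sum of the encoder states.
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

def cross_entropy(logits, target_index):
    """Cross-entropy loss for a single predicted character."""
    probs = softmax(logits)
    return -math.log(probs[target_index])

# Toy encoder output: three feature vectors (hypothetical values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, 2.0]

weights, context = attend(decoder_state, encoder_states)
loss = cross_entropy([2.0, 0.5, -1.0], target_index=0)
```

In a full model of the kind the abstract describes, `encoder_states` would come from the BLSTM encoder, `decoder_state` from the unidirectional LSTM decoder, and the loss would be summed over every character of the target text.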
- Subjects :
- Sequence
Computer science
020206 networking & telecommunications
Pattern recognition
02 engineering and technology
Convolutional neural network
Softmax function
0202 electrical engineering, electronic engineering, information engineering
Feature (machine learning)
020201 artificial intelligence & image processing
Segmentation
Artificial intelligence
Encoder
Details
- Database :
- OpenAIRE
- Journal :
- 2019 International Conference on Document Analysis and Recognition (ICDAR)
- Accession number :
- edsair.doi...........494b1aa34fb14ac5ccc52182bb035a8d
- Full Text :
- https://doi.org/10.1109/icdar.2019.00106