PIEED: Position information enhanced encoder-decoder framework for scene text recognition.
- Source :
- Applied Intelligence; Oct2021, Vol. 51 Issue 10, p6698-6707, 10p
- Publication Year :
- 2021
Abstract
- Scene text recognition (STR) has developed rapidly with the rise of deep learning. Recently, attention-based encoder-decoder frameworks have been widely used in STR to improve recognition. However, the Long Short-Term Memory (LSTM) network commonly used in these frameworks tends to ignore certain position or visual information. To address this problem, we propose a Position Information Enhanced Encoder-Decoder (PIEED) framework for scene text recognition, in which an additional position information enhancement (PIE) module compensates for this shortcoming of the LSTM network. The module tends to retain more position information in the feature sequence, alongside the context information extracted by the LSTM network, which helps improve recognition accuracy for text without context. In addition, our fusion decoder makes full use of the outputs of both the proposed module and the LSTM network, independently learning and preserving useful features, which helps improve recognition accuracy without increasing the number of parameters. The overall framework can be trained end-to-end using only images and ground truth. Extensive experiments on several benchmark datasets demonstrate that the proposed framework surpasses state-of-the-art methods on both regular and irregular text recognition. [ABSTRACT FROM AUTHOR]
- Subjects :
- TEXT recognition
- LONG-term memory
- SHORT-term memory
- DEEP learning
- Language :
- English
- ISSN :
- 0924669X
- Volume :
- 51
- Issue :
- 10
- Database :
- Complementary Index
- Journal :
- Applied Intelligence
- Publication Type :
- Academic Journal
- Accession number :
- 152275109
- Full Text :
- https://doi.org/10.1007/s10489-021-02219-3