PIEED: Position information enhanced encoder-decoder framework for scene text recognition.

Authors :
Ma, Xitao
He, Kai
Zhang, Dazhuang
Li, Dashuang
Source :
Applied Intelligence; Oct2021, Vol. 51 Issue 10, p6698-6707, 10p
Publication Year :
2021

Abstract

Scene text recognition (STR) has developed rapidly with the rise of deep learning. Recently, attention-based encoder-decoder frameworks have been widely used in STR to improve recognition. However, the Long Short-Term Memory (LSTM) network commonly used in these frameworks tends to ignore certain position or visual information. To address this problem, we propose a Position Information Enhanced Encoder-Decoder (PIEED) framework for scene text recognition, in which an additional position information enhancement (PIE) module compensates for this shortcoming of the LSTM network. Our module tends to retain more position information in the feature sequence, in addition to the context information extracted by the LSTM network, which helps improve recognition accuracy for text without context. In addition, our fusion decoder makes full use of the outputs of both the proposed module and the LSTM network, independently learning and preserving useful features, which helps improve recognition accuracy without increasing the number of parameters. The overall framework can be trained end-to-end using only images and ground truth. Extensive experiments on several benchmark datasets demonstrate that the proposed framework surpasses state-of-the-art methods on both regular and irregular text recognition. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0924669X
Volume :
51
Issue :
10
Database :
Complementary Index
Journal :
Applied Intelligence
Publication Type :
Academic Journal
Accession number :
152275109
Full Text :
https://doi.org/10.1007/s10489-021-02219-3