1. Coarse-to-fine document localization in natural scene image with regional attention and recursive corner refinement
- Author
- Zhu, Anna, Zhang, Chen, Li, Zhi, and Xiong, Shengwu
- Abstract
Document localization is a promising step for document-based optical character recognition. The task becomes harder when documents appear in complex natural scene images. In this paper, we propose a coarse-to-fine document localization approach that detects the four corner points of a document in natural scene images. In the first stage, the four corners are roughly predicted by a deep neural network-based Joint Corner Detector (JCD) with an attention mechanism, which roughly localizes the document region via an attentional map. As a key to producing accurate corner inference, the JCD module substantially suppresses background interference in the convolutional features. In the second stage, a corner-specific refiner module refines the previously predicted corners. Because the four document corners have different characteristics, the patches cropped around the predicted corners are fed into four different corner-specific CNN models, which search for the accurate corner locations recursively. Three datasets (the ICDAR 2015 SmartDoc competition 1 dataset, the SEECS-NUSF dataset, and a self-collected dataset) are used to evaluate the method. The experimental results demonstrate the superiority of the proposed method in localizing documents in natural images, especially those with complex backgrounds. Our method outperforms most state-of-the-art methods.
- Published
- 2024