
Arbitrary-shaped text detection with adaptive convolution and path enhancement pyramid network

Authors :: Bin Wei
Guodong Wang
Qi Cheng
Qian Dong
Source :: Multimedia Tools and Applications. 79:29225-29242
Publication Year :: 2020
Publisher :: Springer Science and Business Media LLC, 2020.
Abstract: Scene text detection, an essential component of scene text reading, has recently become an active research field. Segmentation-based methods are especially common, since segmentation results can describe text of arbitrary shape. However, curved text varies widely in shape, scale and orientation, which makes it difficult to locate, so the detector needs to adjust the size of its local receptive fields adaptively and aggregate multi-scale spatial information to locate curved text instances accurately. Moreover, low-level features are critical for localizing large text instances. When a Feature Pyramid Network (FPN) is used for multi-scale feature fusion, the long path from the low level to the top level impedes the flow of accurate localization signals. To solve these two problems, this paper proposes an Adaptive Convolution and Path Enhancement Pyramid Network (ACPEPNet), which locates text instances of arbitrary shape more accurately. First, an Adaptive Convolution Unit is introduced to improve the backbone's ability to aggregate multi-scale spatial information within the same stage. This unit is a lightweight component that adds no extra computational cost, and a backbone network for text feature extraction is built on it. Second, the original FPN structure is redesigned to build a short path from the low level to the top level: the path is changed from one-way flow to two-way flow, and the original features are added at the final stage of information fusion. Experiments on CTW1500, Total-Text, ICDAR 2015 and MSRA-TD500 validate the robustness of the proposed method. Without bells and whistles, the method achieves an F-measure of 80.8% on CTW1500 without external training data.
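
The two-way fusion described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not specify layers, channel counts or fusion operators, so the code below assumes single-channel 2D feature maps, nearest-neighbour upsampling, stride-2 subsampling and element-wise addition as stand-ins. The function names (`upsample2x`, `downsample2x`, `two_way_fusion`) are hypothetical.

```python
# Illustrative sketch only. All operators here are assumptions, not taken
# from the paper: plain addition replaces any learned convolution.

def upsample2x(fm):
    """Nearest-neighbour 2x upsampling of a 2D map (list of rows)."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(fm):
    """Stride-2 subsampling: keep every second row and column."""
    return [row[::2] for row in fm[::2]]


def add(a, b):
    """Element-wise sum of two equally sized 2D maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def two_way_fusion(feats):
    """Fuse a pyramid ordered from low level (largest map) to top level.

    Each map is assumed to be half the height and width of the previous one.
    """
    n = len(feats)
    # Pass 1, top-down (the one-way flow of a plain FPN).
    top_down = [None] * n
    top_down[-1] = feats[-1]
    for i in range(n - 2, -1, -1):
        top_down[i] = add(feats[i], upsample2x(top_down[i + 1]))
    # Pass 2, bottom-up: a short path from the low level to the top level.
    bottom_up = [top_down[0]]
    for i in range(1, n):
        bottom_up.append(add(top_down[i], downsample2x(bottom_up[i - 1])))
    # Final stage: add the original features back in.
    return [add(b, f) for b, f in zip(bottom_up, feats)]


if __name__ == "__main__":
    pyramid = [
        [[1] * 4 for _ in range(4)],  # low level, 4x4
        [[1] * 2 for _ in range(2)],  # middle level, 2x2
        [[1]],                        # top level, 1x1
    ]
    for level in two_way_fusion(pyramid):
        print(level)
```

In this toy setting the second pass lets low-level information reach the top level in a single downsampling chain, which is the intuition behind the "short path"; a real detector would use learned convolutions at each fusion step.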

Subjects :: Computer Networks and Communications
Computer science
business.industry
Detector
020207 software engineering
Pattern recognition
02 engineering and technology
Text detection
Hardware and Architecture
Robustness (computer science)
Pyramid
0202 electrical engineering, electronic engineering, information engineering
Media Technology
Segmentation
Artificial intelligence
business
Spatial analysis
Software

Details

ISSN :: 1573-7721 and 1380-7501
Volume :: 79
Database :: OpenAIRE
Journal :: Multimedia Tools and Applications
Accession number :: edsair.doi...........ec15bafcb561d2dc025ed4636c6971aa
Full Text :: https://doi.org/10.1007/s11042-020-09440-1

