Efficient skew detection and correction in scanned document images through clustering of probabilistic Hough transforms
- Source :
- Pattern Recognition Letters. 152:93-99
- Publication Year :
- 2021
- Publisher :
- Elsevier BV, 2021.
Abstract
- Document scanning is still one of the most widely used document digitization steps; however, skew in scanned documents is inevitable. If this skew is not corrected, the extraction of regions of interest (RoI) and further processing, such as detection and classification on those RoI, become difficult. It has been shown that skew detection and correction significantly improve the accuracy of Optical Character Recognition (OCR) systems. This paper introduces a novel, robust, and straightforward skew detection method for scanned documents, which uses the Probabilistic Hough Transform (PHT) for line detection in a first step and clusters the lines in a second step based on parallelism. The cluster with the largest number of parallel lines represents the expected skewed lines. The proposed method is tested on real scanned images taken from the Document Image Skew Estimation Contest (DISEC’13), Pashto, and Tobacco800 datasets. It performs well in terms of both accuracy and efficiency, and it is robust to noise. Furthermore, we show that it also works on Arabic and Latin scripts.
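The second step described in the abstract (grouping detected lines by parallelism and taking the largest group) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the 1-degree tolerance, the simple one-dimensional angle clustering, and the use of the median are all assumptions made for illustration. The input is assumed to be a list of line segments `(x1, y1, x2, y2)` such as a probabilistic Hough transform would return (for example OpenCV's `cv2.HoughLinesP`, which is not called here).

```python
import math
from statistics import median


def segment_angle(x1, y1, x2, y2):
    """Return the segment's angle in degrees, folded into [-90, 90)."""
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    # Fold so that a segment and its reversed copy get the same angle.
    return (angle + 90.0) % 180.0 - 90.0


def estimate_skew(segments, tolerance_deg=1.0):
    """Estimate skew as the median angle of the largest near-parallel group.

    Hypothetical helper: segments whose sorted angles differ by at most
    tolerance_deg from their neighbour are placed in the same cluster.
    Wrap-around between -90 and +90 degrees is ignored for simplicity.
    """
    angles = sorted(segment_angle(*s) for s in segments)
    if not angles:
        return 0.0
    clusters, current = [], [angles[0]]
    for a in angles[1:]:
        if a - current[-1] <= tolerance_deg:
            current.append(a)      # close enough: same parallel group
        else:
            clusters.append(current)
            current = [a]          # start a new group
    clusters.append(current)
    largest = max(clusters, key=len)
    return median(largest)
```

Under these assumptions, the returned angle is the estimated skew of the dominant text direction; rotating the image by the negative of that angle would be the corresponding correction step.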
- Subjects :
- Computer science
Probabilistic logic
Skew
Pattern recognition
Optical character recognition
Parallel
Hough transform
Artificial Intelligence
Signal Processing
Line (geometry)
Computer Vision and Pattern Recognition
Noise (video)
Artificial intelligence
Cluster analysis
Software
Details
- ISSN :
- 01678655
- Volume :
- 152
- Database :
- OpenAIRE
- Journal :
- Pattern Recognition Letters
- Accession number :
- edsair.doi...........276bfc1dd024b23ea348da8a0d3ac332