Handwriting word spotting in the space of difference between representations using vision transformers.
- Author
- Mhiri, Mohamed, Hamdan, Mohammed, and Cheriet, Mohamed
- Subjects
- *TRANSFORMER models, *WORD recognition, *SPACE perception, *HANDWRITING
- Abstract
Word spotting in handwritten documents is challenging due to the high intra-class and inter-class variability of handwritten forms. This paper addresses the word spotting problem in both the segmentation and training scenarios. Overall, this paper makes the following three contributions: (1) a new word text representation, called the Pyramid of Bidirectional Character Sequences (PBCS), which supports both word spotting and word recognition. The PBCS representation allows trained models to identify the character subsequences shared by words, so words not seen during training can still be represented and spotted. In addition, the PBCS representation encodes word texts redundantly, which aids word discrimination. (2) A binary-classification formulation of the word spotting problem in the difference space between representations, which makes spotting out-of-vocabulary words more efficient. Finally, (3) a new deep neural network architecture that combines the strengths of convolutional layers and transformers. We evaluated our solution on the IAM and RIMES datasets and showed that it outperforms recent state-of-the-art methods in the query-by-example scenario.
• The Pyramid of Bidirectional Character Sequences (PBCS) word text representation.
• Word spotting task as a binary classification problem.
• Combining the strengths of convolutional layers and transformers. [ABSTRACT FROM AUTHOR]
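- Illustration
- To make the abstract's two central ideas concrete (a convolutional-plus-transformer word-image encoder and a binary match/no-match classifier operating on the difference between two embeddings), the following is a minimal, hypothetical PyTorch sketch. It is not the authors' implementation: the class names (WordImageEncoder, DifferenceSpaceClassifier), layer sizes, pooling, and input shapes are assumptions, and the PBCS text representation is omitted.

```python
# Hypothetical sketch, not the paper's code: encode word images with CNN +
# Transformer layers, then classify whether a query/candidate pair matches
# by feeding the difference of their embeddings to a binary head.
import torch
import torch.nn as nn


class WordImageEncoder(nn.Module):
    """Embeds a grayscale word image with convolutional layers followed by a
    small Transformer encoder (the hybrid architecture the abstract describes)."""

    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, embed_dim, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        feats = self.conv(images)               # (B, D, H', W') feature map
        seq = feats.flatten(2).transpose(1, 2)  # (B, H'*W', D) patch sequence
        seq = self.transformer(seq)
        return seq.mean(dim=1)                  # (B, D) pooled word embedding


class DifferenceSpaceClassifier(nn.Module):
    """Binary classifier over the difference between two representations:
    outputs the probability that query and candidate depict the same word."""

    def __init__(self, embed_dim: int = 128):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, query_emb: torch.Tensor, cand_emb: torch.Tensor) -> torch.Tensor:
        diff = query_emb - cand_emb             # the "difference space"
        return torch.sigmoid(self.head(diff)).squeeze(-1)


if __name__ == "__main__":
    encoder, matcher = WordImageEncoder(), DifferenceSpaceClassifier()
    query = torch.rand(4, 1, 32, 96)       # toy query word images
    candidate = torch.rand(4, 1, 32, 96)   # toy candidate word images
    scores = matcher(encoder(query), encoder(candidate))
    print(scores.shape)                    # torch.Size([4]) match probabilities
```

- In a query-by-example setting, the same encoder would embed the query image and every candidate word image, and the classifier's score on each embedding difference would rank the candidates; this is what allows the approach to handle out-of-vocabulary words, since matching does not depend on a closed word list.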
- Published
- 2023