1. Cross-domain document layout analysis using document style guide.
- Author
- Wu, Xingjiao; Xiao, Luwei; Du, Xiangcheng; Zheng, Yingbin; Li, Xin; Ma, Tianlong; Jin, Cheng; He, Liang
- Subjects
- *DEEP learning, *COMPUTER vision, *IMAGE segmentation, *RESEARCH personnel
- Abstract
Document layout analysis (DLA) is a crucial computer vision task that involves partitioning document images into high-level semantic regions such as figures, tables, backgrounds, and texts. Deep learning models for DLA typically require a large amount of labeled data, which is expensive to obtain. Though some researchers use generated data for training, a substantial style gap exists between the generated and target data. Moreover, the quality of the generated samples must be improved to achieve better control. To address these challenges, we propose a cross-domain DLA framework called DL-DSG, which leverages document-style guidance. DL-DSG comprises three components: the document layout generator (DLG), responsible for generating document element locations; the document element decorator (DED), for filling the elements; and the document style discriminator (DSD), for style guidance. In addition to generating controlled documents, we also focus on bridging the gap between the generated and target samples. To this end, we introduce a novel strategy that transforms document style judgment into the document cross-domain style guidance component. We evaluate the effectiveness of DL-DSG on popular DLA datasets, including PubLayNet, DSSE-200, CS-150, and CDSSE, and demonstrate its superior performance.
• We integrate cross-domain and quality-assessment DLA using an unsupervised method.
• This method does not rely on annotated data.
• It is the first unsupervised method for cross-domain DLA.
• Our model achieves significant results for cross-domain DLA.
[ABSTRACT FROM AUTHOR]
- Published
- 2024