1. taka at the FinSBD-3 task: Tables and Figures Extraction using Object Detection Techniques
- Author
- Tien Dung Le
- Subjects
- Structure (mathematical logic), Information retrieval, Computer science, Image processing, Segmentation, Context (language use), Object (computer science), Sensory cue, Object detection, Task (project management)
- Abstract
- FinSBD-3 is a shared task organized in the context of the 1st workshop on Financial Technology on the Web. The task focuses on extracting the entire structure of noisy PDF financial documents, which includes 1) sentences, lists, items, and the organization of lists and items; 2) figures and tables; 3) headers and footers. This paper describes an approach that extracts the figures and tables using their visual cues. We apply object segmentation techniques from image processing to locate figures and tables in the PDF files. A post-processing step is then run to find their exact content. The results show the potential of this approach.
- Published
- 2021