Key–Value Pair Identification from Tables Using Multimodal Learning.

Authors :: Chu, Jung Soo
Pyo, Bryan
Parth, Vik
Hussein, Ahmed
Wang, Patrick
Source :: International Journal of Pattern Recognition & Artificial Intelligence; Jun2023, Vol. 37 Issue 7, p1-15, 15p
Publication Year :: 2023
Abstract: Computer vision and optical character recognition techniques have rapidly advanced in order to accurately capture text and other features from paper documents. While state-of-the-art tools in these fields now yield high accuracy, analyzing their outputs requires more research. Since tables are common in such documents, a new pipeline, based on multimodal learning, is proposed to better extract key–value pairs from tables. Its performance is evaluated with a synthetically generated dataset with randomly generated tables and a dataset of mechanical part documents provided by SiliconExpert Technologies. Its performance is also compared with that of LayoutLM, another state-of-the-art model built for similar tasks. The proposed pipeline provides a fully automated, end-to-end scalable solution, beginning with image processing and computer vision components and ending with a machine learning model that uses data from optical character recognition and natural language processing to make the final decisions. In the best configuration, the pipeline achieved an accuracy of 96.26% on a large, synthetically generated training and test set. When comparing the proposed pipeline with LayoutLM, the proposed pipeline performed similarly on the synthetic dataset and better on the real dataset. These results show the potential of the multimodal approach in extracting key–value pairs from tables in real paper documents. [ABSTRACT FROM AUTHOR]

Subjects :: MACHINE learning
NATURAL language processing
OPTICAL character recognition
COMPUTER vision
IMAGE processing

Details

Language :: English
ISSN :: 02180014
Volume :: 37
Issue :: 7
Database :: Complementary Index
Journal :: International Journal of Pattern Recognition & Artificial Intelligence
Publication Type :: Academic Journal
Accession number :: 164729090
Full Text :: https://doi.org/10.1142/S0218001423520092