Cnosso, a novel method for business document automation based on open information extraction.
- Source :
- Expert Systems with Applications. Jul 2024, Vol. 245, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2024
Abstract
- The state of the art in automated processing of unstructured business documents has evolved from manual labor to advanced AI systems in the span of mere decades. Such systems involve learning techniques, rule or clause sets, neural models – either used alone or in combination – for the extraction to work. As an example, rule-based processes operate on a perceived layout or positioning of the information, whereas model-based frameworks adopt a semantic, and often uninspectable, approach. Verb-Based Semantic Role Labeling (VBSRL) is a novel system presented in a former paper that uses a hybrid foundation to inform the extraction phase via a set of rules modeling natural language. We propose a new VBSRL-based document processing method, aided by valuable and innovative architectural choices, which has been implemented for the Italian language and experimented upon with promising results. Even in its infancy, in fact, the first implementation of this system shows better results than comparable IE solutions, obtaining an aggregate, average F-measure of nearly 79%.
• Automating business document analysis is crucial and time-consuming in enterprises.
• Classification and information extraction for unstructured documents are hard tasks.
• Document processing method via pre-processing, normalization and post-processing.
• Information Extraction as Conceptual Dependency Theory plus Semantic Role Labeling.
• Performance on a real-case scenario shows better results than comparable IE solutions.
[ABSTRACT FROM AUTHOR]
- Subjects :
- *DATA mining
*AUTOMATION
*ITALIAN language
*MANUAL labor
*NATURAL languages
Details
- Language :
- English
- ISSN :
- 0957-4174
- Volume :
- 245
- Database :
- Academic Search Index
- Journal :
- Expert Systems with Applications
- Publication Type :
- Academic Journal
- Accession number :
- 176151956
- Full Text :
- https://doi.org/10.1016/j.eswa.2023.123038