Boosting text segmentation via progressive classification

Authors :
Riccardo Ortale
Eugenio Cesario
Francesco Folino
Antonio Locane
Giuseppe Manco
Source :
Knowledge and Information Systems 15 (2008): 285–320. doi:10.1007/s10115-007-0085-3
Publication Year :
2007
Publisher :
Springer Science and Business Media LLC, 2007.

Abstract

A novel approach for reconciling tuples stored as free text into an existing attribute schema is proposed. The basic idea is to subject the available text to progressive classification, i.e., a multi-stage classification scheme where, at each intermediate stage, a classifier is learnt that analyzes the textual fragments not reconciled at the end of the previous steps. Classification is accomplished by an ad hoc exploitation of traditional association mining algorithms, and is supported by a data transformation scheme which takes advantage of domain-specific dictionaries/ontologies. A key feature is the capability of progressively enriching the available ontology with the results of the previous stages of classification, thus significantly improving the overall classification accuracy. An extensive experimental evaluation shows the effectiveness of our approach.
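
The multi-stage idea in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: the association-mining classifier is replaced here by a much simpler stand-in (a "previous label implies next label" rule kept only above a confidence threshold), and every name in it (`progressive_classify`, `min_conf`, the sample tokens and labels) is a hypothetical chosen for illustration. What it does show is the loop structure: each stage looks only at tokens left unreconciled so far, and labels assigned in one stage are added to the dictionary used by the next.

```python
from collections import Counter, defaultdict

def progressive_classify(records, dictionary, max_stages=3, min_conf=0.6):
    """Label tokens in stages (illustrative stand-in, not the paper's method).

    records: list of token lists; dictionary: token -> attribute label.
    Returns (labels, enriched_dictionary); unreconciled tokens stay None.
    """
    dictionary = dict(dictionary)  # work on a copy
    labels = [[None] * len(r) for r in records]
    for _ in range(max_stages):
        # Step 1: reconcile still-unlabeled tokens found in the dictionary.
        for rec, lab in zip(records, labels):
            for i, tok in enumerate(rec):
                if lab[i] is None and tok in dictionary:
                    lab[i] = dictionary[tok]
        # Step 2: learn simple context rules from tokens labeled so far.
        counts = defaultdict(Counter)
        for lab in labels:
            for prev, cur in zip(lab, lab[1:]):
                if prev is not None and cur is not None:
                    counts[prev][cur] += 1
        rules = {}
        for prev, nexts in counts.items():
            best, n = nexts.most_common(1)[0]
            if n / sum(nexts.values()) >= min_conf:
                rules[prev] = best
        # Step 3: apply the rules to unreconciled tokens and enrich the
        # dictionary, so the next stage can reuse what was learned here.
        changed = False
        for rec, lab in zip(records, labels):
            for i in range(1, len(rec)):
                if lab[i] is None and lab[i - 1] in rules:
                    lab[i] = rules[lab[i - 1]]
                    dictionary.setdefault(rec[i], lab[i])
                    changed = True
        if not changed:
            break
    return labels, dictionary

# Hypothetical sample data: "verdi", "napoli", and "paolo" are unknown at first.
records = [
    ["mario", "rossi", "via", "roma"],
    ["anna", "bianchi", "via", "milano"],
    ["luca", "verdi", "via", "napoli"],
    ["paolo", "verdi"],
]
seed = {
    "mario": "NAME", "anna": "NAME", "luca": "NAME",
    "rossi": "SURNAME", "bianchi": "SURNAME",
    "via": "STREET_TYPE", "roma": "STREET", "milano": "STREET",
}
labels, enriched = progressive_classify(records, seed)
```

In this toy run the third record is completed by context rules in the first stage, and the fourth record's "verdi" is reconciled only in the second stage, through the dictionary entry added by the first. "paolo" remains unreconciled because the stand-in rules only look at the preceding token.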

Details

ISSN :
0219-3116 and 0219-1377
Volume :
15
Database :
OpenAIRE
Journal :
Knowledge and Information Systems
Accession number :
edsair.doi.dedup.....f67c61aa1d490142ac0bf38f203ca1bb