Interactive Document Indexing Method Based on Explicit Semantic Analysis

Authors :
Andrzej Janusz
Adam Krasuski
Wojciech Świeboda
Hung Son Nguyen
Source :
Rough Sets and Current Trends in Computing ISBN: 9783642321146, RSCTC
Publication Year :
2012
Publisher :
Springer Berlin Heidelberg, 2012.

Abstract

In this article we propose a general framework for semantic indexing and search of texts in scientific document repositories. In our approach, a semantic interpreter, which can be seen as a tool for automatic tagging of textual data, is interactively updated based on user feedback in order to improve the quality of the tags it produces. In our experiments, we index our document corpus using the Explicit Semantic Analysis (ESA) method. In this algorithm, an external knowledge base is used to measure relatedness between words and concepts, and those assessments are used to assign meaningful concepts to given texts. In the paper, we explain how the weights expressing relations between particular words and concepts can be improved through interaction with users or by employing expert knowledge. We also present results of experiments on a document corpus acquired from the PubMed Central repository to demonstrate the feasibility of our approach.
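
The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy knowledge base, the term-frequency weighting, the function names (build_weights, interpret, apply_feedback) and the multiplicative feedback rule are all assumptions chosen for brevity. The sketch only shows the general shape of an ESA-style interpreter, in which word-to-concept weights derived from an external knowledge base are summed to tag a text, and of a feedback step that adjusts those weights.

```python
from collections import Counter, defaultdict

# Hypothetical toy knowledge base: concept name -> descriptive text.
# A real ESA setup would use a large external resource instead.
KNOWLEDGE_BASE = {
    "Cardiology": "heart artery blood heart valve",
    "Neurology": "brain neuron nerve brain",
    "Oncology": "tumor cancer cell tumor",
}


def build_weights(kb):
    """Return word -> {concept: weight}.

    Assumption: the weight is the word's relative frequency in the
    concept's description (a stand-in for a TF-IDF-like score).
    """
    weights = defaultdict(dict)
    for concept, text in kb.items():
        counts = Counter(text.split())
        total = sum(counts.values())
        for word, n in counts.items():
            weights[word][concept] = n / total
    return weights


def interpret(text, weights, top_k=2):
    """Tag a text with the top_k concepts by summed word-concept weight."""
    scores = Counter()
    for word in text.lower().split():
        for concept, w in weights.get(word, {}).items():
            scores[concept] += w
    return scores.most_common(top_k)


def apply_feedback(text, concept, weights, accepted, rate=0.5):
    """Adjust weights after a user accepts or rejects a tag.

    Assumption: a simple multiplicative update, scaling the weight of
    every word in the text that is linked to the judged concept.
    """
    factor = 1 + rate if accepted else 1 - rate
    for word in set(text.lower().split()):
        if concept in weights.get(word, {}):
            weights[word][concept] *= factor


weights = build_weights(KNOWLEDGE_BASE)
doc = "blood flow in the heart and brain"
tags_before = interpret(doc, weights)

# Simulated user feedback: the "Neurology" tag is rejected for this text.
apply_feedback(doc, "Neurology", weights, accepted=False)
tags_after = interpret(doc, weights)

print(tags_before)
print(tags_after)
```

In this sketch, rejecting a tag lowers the weights linking the words of the document to that concept, so the same concept scores lower the next time a similar text is interpreted. Accepting a tag has the opposite effect.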

Details

ISBN :
978-3-642-32114-6
Database :
OpenAIRE
Accession number :
edsair.doi...........998ea4dc0979e11a2f7af50b34eda622