Query-based Multi-Document Summarization by Clustering of Documents

Authors :
Prema Nedungadi
Gopal K. R. Naveen
Source :
Proceedings of the 2014 International Conference on Interdisciplinary Advances in Applied Computing.
Publication Year :
2014
Publisher :
ACM, 2014.

Abstract

Information Retrieval (IR) systems such as search engines retrieve large sets of documents, images, and videos in response to a user query. Computational methods such as Automatic Text Summarization (ATS) reduce this information load, enabling users to find information quickly without reading the original text. The challenges for ATS include both the time complexity and the accuracy of summarization. Our proposed IR system consists of three phases: retrieval, clustering, and summarization. In the clustering phase, we extend the Potential-based Hierarchical Agglomerative (PHA) clustering method to a hybrid PHA-ClusteringGain-K-Means approach. Our studies on the DUC 2002 dataset show an increase in both the efficiency and the accuracy of clustering compared with both the conventional Hierarchical Agglomerative Clustering (HAC) algorithm and PHA.
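
To make the three-phase flow in the abstract concrete, the following is a minimal sketch of a retrieve, cluster, summarize pipeline. It is not the authors' implementation: the abstract does not specify the features or similarity measure, so TF-IDF weighting and cosine similarity are assumptions, and a plain k-means with a deterministic seed stands in for the hybrid PHA-ClusteringGain-K-Means method. All function names (`retrieve`, `kmeans`, `summarize`, `answer`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase alphabetic tokens (assumed tokenization, not from the paper)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict) per text, plus the IDF table."""
    counts = [Counter(tokens(t)) for t in texts]
    df = Counter(term for c in counts for term in c)
    idf = {term: math.log(len(texts) / n) + 1.0 for term, n in df.items()}
    return [{t: f * idf[t] for t, f in c.items()} for c in counts], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def query_vector(query, idf):
    """Weight query terms with the corpus IDF; unseen terms get weight 0."""
    return {t: f * idf.get(t, 0.0) for t, f in Counter(tokens(query)).items()}


def retrieve(query, docs, top_n):
    """Phase 1: keep the top_n documents most similar to the query."""
    vecs, idf = tfidf_vectors(docs)
    q = query_vector(query, idf)
    ranked = sorted(range(len(docs)), key=lambda i: cosine(q, vecs[i]), reverse=True)
    return [docs[i] for i in ranked[:top_n]]


def kmeans(vecs, k, iterations=10):
    """Phase 2 stand-in: plain k-means, seeded with the first k vectors."""
    centroids = [dict(v) for v in vecs[:k]]
    labels = [0] * len(vecs)
    for _ in range(iterations):
        # Assign each vector to its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vecs]
        # Recompute each non-empty cluster's centroid as the mean vector.
        for c in range(k):
            members = [v for v, label in zip(vecs, labels) if label == c]
            if members:
                total = {}
                for m in members:
                    for t, w in m.items():
                        total[t] = total.get(t, 0.0) + w
                centroids[c] = {t: w / len(members) for t, w in total.items()}
    return labels


def summarize(query, docs, labels):
    """Phase 3: from each cluster, pick the sentence closest to the query."""
    summary = []
    for c in sorted(set(labels)):
        sentences = [s.strip()
                     for d, label in zip(docs, labels) if label == c
                     for s in re.split(r"(?<=[.!?])\s+", d) if s.strip()]
        vecs, idf = tfidf_vectors(sentences)
        q = query_vector(query, idf)
        best = max(range(len(sentences)), key=lambda i: cosine(q, vecs[i]))
        summary.append(sentences[best])
    return summary


def answer(query, docs, top_n=3, k=2):
    """Run all three phases; k must not exceed the number of retrieved docs."""
    hits = retrieve(query, docs, top_n)
    vecs, _ = tfidf_vectors(hits)
    return summarize(query, hits, kmeans(vecs, k))
```

Calling `answer("cats dogs", docs)` on a small list of document strings returns one extracted sentence per cluster. Swapping `kmeans` for a hierarchical method that chooses the number of clusters would bring the sketch closer to the approach the abstract describes, though the details of that method are not given here.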

Details

Database :
OpenAIRE
Journal :
Proceedings of the 2014 International Conference on Interdisciplinary Advances in Applied Computing
Accession number :
edsair.doi...........406df7aefd5b5e3dd49bedf866dfab56
Full Text :
https://doi.org/10.1145/2660859.2660972