A New Biomedical Text Summarization Method Based on Sentence Clustering and Frequent Itemsets Mining

Authors :
Hacene Belhadef
Mustapha Bouakkaz
Oussama Rouane
Source :
Smart Innovation, Systems and Technologies ISBN: 9783030210045
Publication Year :
2019
Publisher :
Springer International Publishing, 2019.

Abstract

In this paper, we combine sentence clustering and frequent itemset mining to build a single biomedical text summarization method. Biomedical documents are represented as sets of UMLS concepts, and very generic concepts are discarded. The vector space model is used to represent sentences. The K-means clustering algorithm is applied to cluster semantically similar sentences. The most frequent itemsets are extracted from the global cluster. The generated frequent itemsets are used to calculate the score of each sentence. The top N highest-scoring sentences are selected to form the final summary. The method is evaluated against three summarizers (TextRank, SweSum and an Itemset based summarizer) on 50 randomly selected biomedical papers from the BioMed Central full-text database. The evaluation compares the generated summaries with the abstracts of these papers using the ROUGE toolkit. Our method ranked first on the ROUGE-1 and ROUGE-2 measures, an improvement of approximately 3% over the Itemset based summarizer, and second on the ROUGE-SU4 measure, approximately 1% below the Itemset based summarizer.
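The itemset mining and sentence scoring steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so the sketch assumes a sentence's score is the sum of the supports of the frequent itemsets it contains. The concept extraction and K-means clustering steps are omitted, the concept labels are plain strings rather than UMLS identifiers, and the function names, `min_support` and `max_size` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size concepts
    that occur in at least min_support sentences (brute-force enumeration)."""
    counts = Counter()
    for concepts in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(concepts), size):
                counts[frozenset(combo)] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


def score_sentences(transactions, itemsets):
    """Assumed scoring rule: sum the support of every frequent itemset
    that is fully contained in the sentence."""
    return [
        sum(n for itemset, n in itemsets.items() if itemset <= concepts)
        for concepts in transactions
    ]


def summarize(transactions, top_n, min_support=2):
    """Return the indices of the top_n highest-scoring sentences,
    listed in document order. Ties are broken by sentence position."""
    itemsets = frequent_itemsets(transactions, min_support)
    scores = score_sentences(transactions, itemsets)
    ranked = sorted(range(len(transactions)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:top_n])


# Hypothetical sentences, each already mapped to a set of concept labels.
sentences = [
    {"aspirin", "stroke", "prevention"},
    {"aspirin", "stroke"},
    {"diet", "exercise"},
    {"aspirin", "prevention", "bleeding"},
]

selected = summarize(sentences, top_n=2)
```

Brute-force enumeration is used here only to keep the sketch short. A level-wise algorithm such as Apriori or FP-Growth would be the usual choice for larger documents.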

Details

ISBN :
978-3-030-21004-5
Database :
OpenAIRE
Journal :
Smart Innovation, Systems and Technologies ISBN: 9783030210045
Accession number :
edsair.doi...........8d29e302d93d9050c0278e7b3494913c