Extractive text summarization system to aid data extraction from full text in systematic review development.

Authors :
Bui, Duy Duc An
Del Fiol, Guilherme
Hurdle, John F.
Jonnalagadda, Siddhartha
Source :
Journal of Biomedical Informatics; Dec 2016, Vol. 64, pp. 265-272, 8 pp.
Publication Year :
2016

Abstract

Objectives: Extracting data from publication reports is a standard process in systematic review (SR) development. However, the data extraction process still relies too much on manual effort, which is slow, costly, and subject to human error. In this study, we developed a text summarization system aimed at enhancing productivity and reducing errors in the traditional data extraction process.

Methods: We developed a computer system that used machine learning and natural language processing approaches to automatically generate summaries of full-text scientific publications. The summaries at the sentence and fragment levels were evaluated in finding common clinical SR data elements such as sample size, group size, and PICO values. We compared the computer-generated summaries with human-written summaries (title and abstract) in terms of the presence of necessary information for the data extraction as presented in the Cochrane review's study characteristics tables.

Results: At the sentence level, the computer-generated summaries covered more information than the human-written summaries did for systematic reviews (recall 91.2% vs. 83.8%, p < 0.001). They also had a better density of relevant sentences (precision 59% vs. 39%, p < 0.001). At the fragment level, the ensemble approach combining rule-based, concept mapping, and dictionary-based methods performed better than individual methods alone, achieving an 84.7% F-measure.

Conclusion: Computer-generated summaries are potential alternative information sources for data extraction in systematic review development. Machine learning and natural language processing are promising approaches to the development of such an extractive summarization system. [ABSTRACT FROM AUTHOR]
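
The abstract reports sentence-level precision and recall and a fragment-level F-measure. As a reading aid, the sketch below shows how these three metrics are conventionally computed for an extractive summary. It is not the authors' evaluation code: the function name `precision_recall_f1` and the sentence indices are hypothetical, and it assumes each sentence is simply labeled relevant or not relevant.

```python
def precision_recall_f1(selected, relevant):
    """Return (precision, recall, F1) for a set of selected sentence IDs.

    selected: sentence IDs placed in the summary (hypothetical input).
    relevant: sentence IDs that carry a needed data element (hypothetical gold labels).
    """
    selected, relevant = set(selected), set(relevant)
    true_positives = len(selected & relevant)

    # Precision: share of summary sentences that are relevant ("density").
    precision = true_positives / len(selected) if selected else 0.0
    # Recall: share of relevant sentences that the summary covers.
    recall = true_positives / len(relevant) if relevant else 0.0
    # F-measure (F1): harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative values only, not data from the study.
summary_ids = [0, 3, 4, 7, 9]
gold_ids = [3, 4, 9, 12]
p, r, f = precision_recall_f1(summary_ids, gold_ids)
print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

In these terms, the abstract's "density of relevant sentences" corresponds to precision, and "covered more information" corresponds to recall.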

Details

Language :
English
ISSN :
15320464
Volume :
64
Database :
Supplemental Index
Journal :
Journal of Biomedical Informatics
Publication Type :
Academic Journal
Accession number :
119848541
Full Text :
https://doi.org/10.1016/j.jbi.2016.10.014