
Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain

Authors :
Regina Nogueira
Anália Lourenço
André L. Santos
Universidade do Minho
Source :
Advances in Distributed Computing and Artificial Intelligence Journal, Vol 1, Iss 1, Pp 1-8 (2013), Advances in Distributed Computing and Artificial Intelligence Journal, Vol 1, Iss 1, Pp 1-8 (2012), Repositório Científico de Acesso Aberto de Portugal (RCAAP)
Publication Year :
2013
Publisher :
Ediciones Universidad de Salamanca, 2013.

Abstract

Scientific publications are the main vehicle for disseminating information in the field of biotechnology for wastewater treatment. New research paradigms and the application of high-throughput technologies have considerably increased the rate of publication. As a result, manual curation has become harder, more error-prone and more time-consuming, leading to a probable loss of information and inefficient knowledge acquisition. Research outputs therefore hardly reach engineers, which hampers the calibration of the mathematical models used to optimize the stability and performance of biotechnological systems. In this context, we have developed a data curation workflow, based on text mining techniques, to extract numerical parameters from scientific literature, and applied it to the biotechnology domain. The workflow was built to process wastewater-related articles, with the main goal of identifying physico-chemical parameters mentioned in the text. This work describes the implementation of the workflow, identifies achievements and current limitations in the overall process, and presents the results obtained for a corpus of 50 full-text documents.

Details

Language :
English
ISSN :
2255-2863
Volume :
1
Issue :
1
Database :
OpenAIRE
Journal :
Advances in Distributed Computing and Artificial Intelligence Journal
Accession number :
edsair.doi.dedup.....50312924da885d320bc14f6129b1a5cb