CWL+Research Object == Complete Provenance

Authors :
Khan, Farah Zaib
Soiland-Reyes, Stian
Lonie, Andrew
Sinnott, Richard
Source :
Khan, F Z, Soiland-Reyes, S, Lonie, A & Sinnott, R 2017, 'CWL+Research Object == Complete Provenance', Bioinformatics Open Source Conference (BOSC) 2017, Prague, Czech Republic, 22/07/17-23/07/17. DOI: 10.7490/f1000research.1114781.1
Publication Year :
2017

Abstract

Provenance is defined as ‘the beginning of something’s existence; something’s origin’ or ‘a record of ownership of a work of art or an antique, used as a guide to authenticity or quality’. Provenance tracking is crucial in scientific studies, where workflows have emerged as an exemplar approach to mechanize data-intensive analyses. Gil et al. analyzed the challenges of scientific workflows and concluded that a formally specified workflow helps ‘accelerate the rate of scientific process’ and makes it easier for others to reproduce a given experiment, provided that the provenance of the end-to-end process is captured at every level.

We have implemented an exemplar GATK variant calling workflow using three approaches to workflow definition, namely Galaxy, CWL and Cpipe, to identify the assumptions implicit in these approaches. These assumptions lead to limited or no understanding of reproducibility requirements, owing to a lack of documentation and of comprehensive provenance tracking. This exercise resulted in the identification of the provenance information that is crucial for genomic workflows.

CWL provides a declarative approach to workflow definition that makes minimal assumptions about the precise software environment, base software dependencies, configuration settings, alteration of parameters and software versions. It aims to provide an open source, extensible standard for building flexible and customized workflows that include the intricate details of every process. It facilitates the capture of information by supporting the declaration of requirements, `cwl:tool`, checksums and so on. Currently, there is no mechanism to gather the information produced by a workflow run into one bundle for future use. We propose to demonstrate the implementation of a module for CWL.
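
The bundling step that the abstract describes as missing can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the module proposed by the authors and not the Research Object format: the function names (`file_checksum`, `build_manifest`, `demo`), the manifest layout and the sample file names are all hypothetical. The only convention borrowed from CWL is the `sha1$` prefix that CWL uses when reporting File checksums.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Return a CWL-style checksum string ("sha1$<hex digest>") for one file."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        # Read in chunks so large genomic files are never loaded whole.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return "sha1$" + digest.hexdigest()


def build_manifest(run_dir: Path, workflow_file: str) -> dict:
    """Describe every file under run_dir by relative path, size and checksum.

    The manifest layout is a hypothetical one chosen for this sketch.
    """
    entries = []
    for path in sorted(p for p in run_dir.rglob("*") if p.is_file()):
        entries.append({
            "path": path.relative_to(run_dir).as_posix(),
            "size": path.stat().st_size,
            "checksum": file_checksum(path),
        })
    return {"workflow": workflow_file, "files": entries}


def demo() -> dict:
    """Build a manifest for a throwaway directory with placeholder files."""
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        # Placeholder contents only; these are not real workflow or VCF files.
        (run_dir / "workflow.cwl").write_text("placeholder workflow description\n")
        (run_dir / "outputs").mkdir()
        (run_dir / "outputs" / "variants.vcf").write_bytes(b"hello")
        return build_manifest(run_dir, "workflow.cwl")


if __name__ == "__main__":
    print(json.dumps(demo(), indent=2))
```

Recording a checksum next to each path lets a later reader verify that the files in the bundle are the ones the run actually produced, which is the kind of end-to-end provenance the abstract argues for.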

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.od......3818..618b9b6cbbb9a8955f3bb1616ce7cbb2