1. iRODS metadata management for a cancer genome analysis workflow
- Author
-
Lech Nieroda, Lukas Maas, Scott Thiebes, Ulrich Lang, Ali Sunyaev, Viktor Achter, and Martin Peifer
- Subjects
Next generation sequencing (NGS) ,Genome analysis ,iRODS ,Workflow integration ,High performance computing (HPC) ,Data security ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5 - Abstract
Abstract Background The massive amounts of data from next generation sequencing (NGS) methods pose various challenges with respect to data security, storage and metadata management. While there is a broad range of data analysis pipelines, these challenges remain largely unaddressed to date. Results We describe the integration of the open-source metadata management system iRODS (Integrated Rule-Oriented Data System) with a cancer genome analysis pipeline in a high performance computing environment. The system allows for customized metadata attributes as well as fine-grained protection rules and is augmented by a user-friendly front-end for metadata input. This results in a robust, efficient end-to-end workflow under consideration of data security, central storage and unified metadata information. Conclusions Integrating iRODS with an NGS data analysis pipeline is a suitable method for addressing the challenges of data security, storage and metadata management in NGS environments.
- Published
- 2019
- Full Text
- View/download PDF