1. The metaRbolomics Toolbox in Bioconductor and beyond
- Author
Michael A. Stravs, Jan Stanstrup, Etienne A. Thévenot, Kristian Peters, Tobias Schulze, Ewy Mathé, Michael Witting, Emma L. Schymanski, Thomas Naake, Johannes Rainer, Rick Helmus, Reza M. Salek, Ralf J. M. Weber, L Nicolotti, Hendrik Treutler, Egon Willighagen, Steffen Neumann, Nils Hoffmann, and Corey D. Broeckling
- Subjects
Compound identification ,Computer science ,Endocrinology, Diabetes and Metabolism ,lcsh:QR1-502 ,Bioconductor ,Statistical data analysis ,Review ,computer.software_genre ,01 natural sciences ,Biochemistry ,lcsh:Microbiology ,feature selection ,Faculty of Science ,0303 health sciences ,AN R PACKAGE ,mass Spectrometry ,metabolomics ,Toolbox ,ddc ,FLOW-INJECTION ,Feature selection ,Data integration ,User interface ,statistical data analysis ,Workflow management system ,Signal processing ,HUMAN METABOLOME DATABASE ,metabolite networks ,HIGH-THROUGHPUT ,03 medical and health sciences ,NMR spectroscopy ,CRAN ,PEAK DETECTION ,Metabolomics ,Human Metabolome Database ,signal processing ,Molecular Biology ,data integration ,030304 developmental biology ,Mass spectrometry ,010401 analytical chemistry ,OPEN SOURCE SOFTWARE ,bioconductor ,Data science ,Lipidomics ,Mass Spectrometry ,Nmr Spectroscopy ,R ,Cran ,Signal Processing ,Statistical Data Analysis ,Feature Selection ,Compound Identification ,Metabolite Networks ,Data Integration ,0104 chemical sciences ,MASS-SPECTROMETRY DATA ,Workflow ,Scripting language ,Metabolite networks ,compound identification ,lipidomics ,FEATURE-SELECTION ,DIFFERENTIAL NETWORK ANALYSIS ,MISSING VALUE IMPUTATION ,computer
- Abstract
Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics-specific packages primarily available on CRAN, Bioconductor and GitHub.
- Published
2019