Mixture model normalization for non-targeted gas chromatography/mass spectrometry metabolomics data.

Authors :
Reisetter, Anna C.
Muehlbauer, Michael J.
Bain, James R.
Nodzenski, Michael
Stevens, Robert D.
Ilkayeva, Olga
Metzger, Boyd E.
Newgard, Christopher B.
Lowe Jr., William L.
Scholtens, Denise M.
Source :
BMC Bioinformatics; 2/2/2017, Vol. 18, p1-17, 17p, 1 Diagram, 2 Charts, 8 Graphs
Publication Year :
2017

Abstract

Background: Metabolomics offers a unique integrative perspective for health research, reflecting genetic and environmental contributions to disease-related phenotypes. Identifying robust associations in population-based or large-scale clinical studies demands large numbers of subjects and therefore sample batching for gas chromatography/mass spectrometry (GC/MS) non-targeted assays. When run over weeks or months, technical noise due to batch and run-order threatens data interpretability. Application of existing normalization methods to metabolomics is challenged by unsatisfied modeling assumptions and, notably, failure to address batch-specific truncation of low abundance compounds.

Results: To curtail technical noise and make GC/MS metabolomics data amenable to analyses describing biologically relevant variability, we propose mixture model normalization (mixnorm) that accommodates truncated data and estimates per-metabolite batch and run-order effects using quality control samples. Mixnorm outperforms other approaches across many metrics, including improved correlation of non-targeted and targeted measurements and superior performance when metabolite detectability varies according to batch. For some metrics, particularly when truncation is less frequent for a metabolite, mean centering and median scaling demonstrate comparable performance to mixnorm.

Conclusions: When quality control samples are systematically included in batches, mixnorm is uniquely suited to normalizing non-targeted GC/MS metabolomics data due to explicit accommodation of batch effects, run order and varying thresholds of detectability. Especially in large-scale studies, normalization is crucial for drawing accurate conclusions from non-targeted GC/MS metabolomics data. [ABSTRACT FROM AUTHOR]
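
As a point of reference for the comparison methods named in the abstract, the following is a minimal sketch of per-batch mean centering for a single metabolite. It is not mixnorm and is not taken from the article: the function name `batch_mean_center`, the use of `None` for undetected (truncated) values, and the toy numbers are all assumptions made for illustration. The sketch simply skips undetected values, which is the kind of simplification the abstract identifies as a limitation when truncation varies by batch.

```python
from collections import defaultdict
from statistics import mean


def batch_mean_center(values, batches):
    """Subtract each batch's mean from its detected values.

    `values` holds log-scale abundances for one metabolite; `None` marks
    a sample in which the metabolite was not detected (an assumption of
    this sketch). Undetected values are left as `None` and are ignored
    when computing batch means.
    """
    detected_by_batch = defaultdict(list)
    for value, batch in zip(values, batches):
        if value is not None:
            detected_by_batch[batch].append(value)

    # One mean per batch, computed from detected values only.
    batch_means = {b: mean(vs) for b, vs in detected_by_batch.items()}

    return [
        None if value is None else value - batch_means[batch]
        for value, batch in zip(values, batches)
    ]


# Hypothetical toy data: two batches with a shift of 3 units between them.
log_abundance = [10.0, 12.0, None, 13.0, 15.0, 14.0]
batch = ["A", "A", "A", "B", "B", "B"]
centered = batch_mean_center(log_abundance, batch)
```

After centering, both batches in the toy data are centered at zero, so the between-batch shift is removed. The undetected sample in batch A stays undetected, and because it is excluded from the batch mean, the estimate for batch A is biased upward whenever low values are the ones being truncated.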

Details

Language :
English
ISSN :
1471-2105
Volume :
18
Database :
Complementary Index
Journal :
BMC Bioinformatics
Publication Type :
Academic Journal
Accession number :
122966914
Full Text :
https://doi.org/10.1186/s12859-017-1501-7