Measuring associations between the microbiota and repeated measures of continuous clinical variables using a lasso-penalized generalized linear mixed model

Authors :
Elodie Ghedin
Steven R. Duncan
Karen T. Cuenco
Laura Tipton
Alison Morris
Ruth M. Greenblatt
Frank C. Sciurba
Michael P. Donahoe
Laurence Huang
Eric C. Kleerup
Source :
BioData Mining, Vol 11, Iss 1, Pp 1-20 (2018)
Publication Year :
2018
Publisher :
BioMed Central, 2018.

Abstract

Background: Human microbiome studies in clinical settings generally focus on distinguishing the microbiota in health from that in disease at a specific point in time. However, microbiome samples may be associated with disease severity or continuous clinical health indicators that are often assessed at multiple time points. While the temporal data from clinical and microbiome samples may be informative, analysis of this type of data can be problematic for standard statistical methods.

Results: To identify associations between microbiota and continuous clinical variables measured repeatedly in two studies of the respiratory tract, we adapted a statistical method, the lasso-penalized generalized linear mixed model (LassoGLMM). LassoGLMM can screen for associated clinical variables, incorporate repeated measures of individuals, and address the large number of species found in the microbiome. When the number of variables is an order of magnitude larger than the number of samples, as is common in microbiome studies, LassoGLMM can be imperfect in its variable selection. We overcome this limitation by adding a pre-screening step to reduce the number of variables evaluated in the model. We assessed the use of this adapted two-stage LassoGLMM for its ability to determine which microbes are associated with continuous repeated clinical measures. We found associations (retaining a non-zero coefficient in the LassoGLMM) between 10 laboratory measurements and 43 bacterial genera in the oral microbiota, and between 2 cytokines and 3 bacterial genera in the lung. We compared our associations with those identified by the Wilcoxon test after dichotomizing our outcomes and identified a non-significant trend towards differential abundance between high and low outcomes. Our two-step LassoGLMM explained more of the variance seen in the outcome of interest than other variants of the LassoGLMM method.

Conclusions: We demonstrated a method that can account for the large number of genera detected in microbiome studies and repeated measures of clinical or longitudinal studies, allowing for the detection of strong associations between microbes and clinical measures. By incorporating the design strengths of repeated measurements and a pre-screening step to aid variable selection, our two-step LassoGLMM will be a useful analytic method for investigating relationships between microbes and repeatedly measured continuous outcomes.

Electronic supplementary material: The online version of this article (10.1186/s13040-018-0173-9) contains supplementary material, which is available to authorized users.
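
The two-stage idea described in the abstract (screen the genera first, then fit a lasso-penalized model that respects repeated measures) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It rests on several assumptions that do not come from the record: the screening criterion (absolute Pearson correlation), the use of within-subject centering as a crude stand-in for a random intercept instead of a fitted mixed model, a Gaussian outcome, and all function and parameter names (`two_stage_sketch`, `keep`, `lam`), which are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def demean_by_subject(values, subjects):
    """Subtract each subject's own mean from its repeated measures.

    Assumption: this stands in for a random intercept per subject;
    it is not a mixed-model fit.
    """
    groups = {}
    for v, s in zip(values, subjects):
        groups.setdefault(s, []).append(v)
    means = {s: mean(vs) for s, vs in groups.items()}
    return [v - means[s] for v, s in zip(values, subjects)]


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(cols, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - X b||^2 + lam*||b||_1.

    `cols` is a list of predictor columns, each as long as `y`.
    """
    n = len(y)
    beta = [0.0] * len(cols)
    resid = list(y)
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            norm = sum(c * c for c in col) / n
            if norm == 0:
                continue
            # Covariance of column j with the residual that excludes column j.
            rho = sum(c * (r + beta[j] * c) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / norm
            delta = new - beta[j]
            if delta != 0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta


def two_stage_sketch(X, y, subjects, keep, lam):
    """Screen to `keep` columns, then fit a lasso on subject-centered data.

    X is a list of rows (samples x genera). Returns {column index: coefficient}
    for the columns that retain a non-zero coefficient.
    """
    p = len(X[0])
    columns = [[row[j] for row in X] for j in range(p)]
    y_w = demean_by_subject(y, subjects)
    cols_w = [demean_by_subject(c, subjects) for c in columns]
    # Stage 1 (assumed criterion): rank genera by |correlation| with the outcome.
    ranked = sorted(range(p), key=lambda j: abs(pearson(cols_w[j], y_w)), reverse=True)
    kept = sorted(ranked[:keep])
    # Stage 2: the lasso penalty selects among the screened genera.
    beta = lasso_cd([cols_w[j] for j in kept], y_w, lam)
    return {j: b for j, b in zip(kept, beta) if b != 0.0}
```

In this sketch, a genus counts as "associated" when its coefficient survives the penalty, mirroring the abstract's criterion of retaining a non-zero coefficient. For real analyses, a dedicated mixed-model implementation with tuned penalty selection would be needed.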

Details

Language :
English
ISSN :
1756-0381
Volume :
11
Database :
OpenAIRE
Journal :
BioData Mining
Accession number :
edsair.doi.dedup.....b8491f9176c857a99ad4a51b94728027