Measuring associations between the microbiota and repeated measures of continuous clinical variables using a lasso-penalized generalized linear mixed model
- Source :
- BioData Mining, Vol 11, Iss 1, Pp 1-20 (2018)
- Publication Year :
- 2018
- Publisher :
- BioMed Central, 2018.
Abstract
- Background: Human microbiome studies in clinical settings generally focus on distinguishing the microbiota in health from that in disease at a specific point in time. However, microbiome samples may be associated with disease severity or continuous clinical health indicators that are often assessed at multiple time points. While the temporal data from clinical and microbiome samples may be informative, analysis of this type of data can be problematic for standard statistical methods.
Results: To identify associations between microbiota and continuous clinical variables measured repeatedly in two studies of the respiratory tract, we adapted a statistical method, the lasso-penalized generalized linear mixed model (LassoGLMM). LassoGLMM can screen for associated clinical variables, incorporate repeated measures of individuals, and address the large number of species found in the microbiome. When the number of variables is an order of magnitude larger than the number of samples, as is common in microbiome studies, LassoGLMM can be imperfect in its variable selection. We overcome this limitation by adding a pre-screening step to reduce the number of variables evaluated in the model. We assessed the use of this adapted two-stage LassoGLMM for its ability to determine which microbes are associated with continuous repeated clinical measures. We found associations (retaining a non-zero coefficient in the LassoGLMM) between 10 laboratory measurements and 43 bacterial genera in the oral microbiota, and between 2 cytokines and 3 bacterial genera in the lung. We compared our associations with those identified by the Wilcoxon test after dichotomizing our outcomes and identified a non-significant trend towards differential abundance between high and low outcomes. Our two-step LassoGLMM explained more of the variance seen in the outcome of interest than other variants of the LassoGLMM method.
Conclusions: We demonstrated a method that can account for the large number of genera detected in microbiome studies and repeated measures of clinical or longitudinal studies, allowing for the detection of strong associations between microbes and clinical measures. By incorporating the design strengths of repeated measurements and a prescreening step to aid variable selection, our two-step LassoGLMM will be a useful analytic method for investigating relationships between microbes and repeatedly measured continuous outcomes.
Electronic supplementary material: The online version of this article (10.1186/s13040-018-0173-9) contains supplementary material, which is available to authorized users.
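The two-stage idea in the abstract (pre-screen the genera, then fit a lasso-penalized model that respects repeated measures) can be illustrated with a minimal sketch. This is not the authors' implementation, and the abstract does not specify the pre-screening rule. The sketch assumes three stand-ins: within-subject mean-centering in place of a fitted random intercept, ranking by absolute Pearson correlation as the pre-screen, and a plain coordinate-descent lasso for a Gaussian outcome. All function names, the synthetic data, and the values of `keep` and `lam` are hypothetical.

```python
import random

def center_within_subject(values, subjects):
    """Subtract each subject's mean (a crude stand-in for a random intercept)."""
    sums, counts = {}, {}
    for v, s in zip(values, subjects):
        sums[s] = sums.get(s, 0.0) + v
        counts[s] = counts.get(s, 0) + 1
    return [v - sums[s] / counts[s] for v, s in zip(values, subjects)]

def abs_corr(x, y):
    """Absolute Pearson correlation; returns 0.0 for a constant input."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5

def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(columns, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1."""
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)  # residual y - Xb, kept up to date as beta changes
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            norm = sum(c * c for c in col) / n
            if norm == 0:
                continue
            # Covariance of column j with the partial residual (beta_j removed).
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / norm
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for r, c in zip(resid, col)]
                beta[j] = new
    return beta

def two_stage(X_cols, y, subjects, keep, lam):
    """Stage 1: keep the top-`keep` genera by correlation. Stage 2: lasso."""
    yc = center_within_subject(y, subjects)
    Xc = [center_within_subject(col, subjects) for col in X_cols]
    ranked = sorted(range(len(Xc)), key=lambda j: abs_corr(Xc[j], yc), reverse=True)
    kept = sorted(ranked[:keep])
    beta = lasso_cd([Xc[j] for j in kept], yc, lam)
    # Report only genera that retain a non-zero coefficient.
    return {j: b for j, b in zip(kept, beta) if b != 0.0}

# Synthetic example (hypothetical data): 20 subjects, 4 visits each, 40 genera.
rng = random.Random(0)
n_subjects, n_visits, n_genera = 20, 4, 40
subjects = [s for s in range(n_subjects) for _ in range(n_visits)]
n = len(subjects)
X_cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(n_genera)]
intercepts = [rng.gauss(0, 2) for _ in range(n_subjects)]
# Only genera 3 and 17 truly affect the outcome.
y = [intercepts[s] + 2.0 * X_cols[3][i] - 1.5 * X_cols[17][i] + rng.gauss(0, 0.3)
     for i, s in enumerate(subjects)]

selected = two_stage(X_cols, y, subjects, keep=10, lam=0.2)
print(selected)
```

As in the abstract, a genus counts as associated when it retains a non-zero coefficient after the penalized fit. A real analysis would estimate the random effects and choose the penalty by a model-selection criterion rather than fixing `lam` by hand.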
- Subjects :
- 0301 basic medicine
16S
Continuous outcomes
Artificial Intelligence and Image Processing
Wilcoxon signed-rank test
Feature selection
Disease
lcsh:Analysis
Medical Biochemistry and Metabolomics
Biology
lcsh:Computer applications to medicine. Medical informatics
Biochemistry
Generalized linear mixed model
03 medical and health sciences
Lasso (statistics)
Clinical Research
Statistics
Genetics
Microbiome
Dental/Oral and Craniofacial Disease
Lung
Molecular Biology
Research
Microbiota
Human microbiome
Repeated measures design
lcsh:QA299.6-433
3. Good health
Computer Science Applications
Computational Mathematics
030104 developmental biology
Computational Theory and Mathematics
Specialist Studies in Education
lcsh:R858-859.7
Repeated measures
ITS
Lasso
GLMM
Details
- Language :
- English
- ISSN :
- 1756-0381
- Volume :
- 11
- Database :
- OpenAIRE
- Journal :
- BioData Mining
- Accession number :
- edsair.doi.dedup.....b8491f9176c857a99ad4a51b94728027