Analyzing differences between microbiome communities using mixture distributions.

Authors :
Shestopaloff K
Escobar MD
Xu W
Source :
Statistics in medicine [Stat Med] 2018 Nov 30; Vol. 37 (27), pp. 4036-4053. Date of Electronic Publication: 2018 Jul 23.
Publication Year :
2018

Abstract

In this paper, we present a method to assess differences between microbiome communities that effectively models sparse count data and accounts for presence-absence bias frequently encountered when zeros are present. We assume that the observed data for each operational taxonomic unit is Poisson generated with the rate for each sample originating from an underlying rate distribution. We propose to model this distribution using a mixture model, specifying the components based on the posterior rate distribution of a count and estimating the optimal weights using a least squares objective function. The distribution incorporates varying resolutions of samples, a point mass for differentiating structural and nonstructural zeros, and a truncation point mass to account for high values that are too sparse to model. As mixture component specification is not always straightforward, a method to estimate a joint model from several mixture distributions using minimum distances of bootstrap iterates is proposed. Once the population rate distribution is approximated, we obtain sample-specific distributions by conditioning on the observed operational taxonomic unit count, resolution, and estimated mixture distribution and then use these to estimate pairwise distances for a permutation test. The method gives an accurate estimate of the true proportion of zeros for presence-absence, effectively models the distribution of low counts using the mixture distribution, and achieves good power for detecting differences in a variety of scenarios. The method is tested using a simulation study and applied to two microbiome datasets. In each case, the results are compared with a number of existing methods.
(© 2018 John Wiley & Sons, Ltd.)
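
The conditioning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, for illustration only, that the estimated mixture is a set of fixed candidate rates with known weights, that a rate of 0 plays the role of the structural-zero point mass, and that resolution acts as a multiplier on the rate (for example, sequencing depth). The names `poisson_pmf`, `posterior_weights`, `rates`, and `weights` and all numeric values are hypothetical.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of count k with mean mu; mu == 0 is a point mass at 0."""
    if mu == 0.0:
        return 1.0 if k == 0 else 0.0
    # Work on the log scale so larger counts do not overflow the factorial.
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def posterior_weights(count, resolution, rates, weights):
    """Posterior probability of each mixture component given one observed count.

    Assumption: the expected count for a component is rate * resolution.
    """
    joint = [w * poisson_pmf(count, r * resolution) for r, w in zip(rates, weights)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("observed count has zero probability under the mixture")
    return [j / total for j in joint]


# Hypothetical mixture: rate 0.0 stands in for the structural-zero point mass.
rates = [0.0, 0.001, 0.01]
weights = [0.5, 0.3, 0.2]

# A sample with an observed zero at resolution 1000.
post_zero = posterior_weights(0, 1000, rates, weights)
# A sample with an observed count of 5 at the same resolution.
post_five = posterior_weights(5, 1000, rates, weights)

print(post_zero)  # mass shifts toward the structural-zero component
print(post_five)  # the structural-zero component gets probability 0.0
```

In this sketch an observed zero raises the probability of the structural-zero component above its prior weight, while any positive count rules that component out. Estimating the weights themselves (the least squares step) and the permutation test on pairwise distances are outside the scope of the example.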

Details

Language :
English
ISSN :
1097-0258
Volume :
37
Issue :
27
Database :
MEDLINE
Journal :
Statistics in medicine
Publication Type :
Academic Journal
Accession number :
30039541
Full Text :
https://doi.org/10.1002/sim.7896