|$R^{2}$|s for Correlated Data: Phylogenetic Models, LMMs, and GLMMs.
- Source :
- Systematic Biology. Mar 2019, Vol. 68 Issue 2, p234-251. 18p.
- Publication Year :
- 2019
Abstract
- Many researchers want to report an |$R^{2}$| to measure the variance explained by a model. When the model includes correlation among data, such as phylogenetic models and mixed models, defining an |$R^{2}$| faces two conceptual problems. (i) It is unclear how to measure the variance explained by predictor (independent) variables when the model contains covariances. (ii) Researchers may want the |$R^{2}$| to include the variance explained by the covariances by asking questions such as "How much of the data is explained by phylogeny?" Here, I investigated three |$R^{2}$|s for phylogenetic and mixed models. |$R^{2}_{resid}$| is an extension of the ordinary least-squares |$R^{2}$| that weights residuals by variances and covariances estimated by the model; it is closely related to |$R^{2}_{glmm}$| presented by Nakagawa and Schielzeth (2013. A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods Ecol. Evol. 4:133–142). |$R^{2}_{pred}$| is based on predicting each residual from the fitted model and computing the variance between observed and predicted values. |$R^{2}_{lik}$| is based on the likelihood of fitted models, and therefore, reflects the amount of information that the models contain. These three |$R^{2}$|s are formulated as partial |$R^{2}$|s, making it possible to compare the contributions of predictor variables and variance components (phylogenetic signal and random effects) to the fit of models. Because partial |$R^{2}$|s compare a full model with a reduced model without components of the full model, they are distinct from marginal |$R^{2}$|s that partition additive components of the variance. I assessed the properties of the |$R^{2}$|s for phylogenetic models using simulations for continuous and binary response data (phylogenetic generalized least squares and phylogenetic logistic regression).
Because the |$R^{2}$|s are designed broadly for any model for correlated data, I also compared |$R^{2}$|s for linear mixed models and generalized linear mixed models. |$R^{2}_{resid}$|, |$R^{2}_{pred}$|, and |$R^{2}_{lik}$| all have similar performance in describing the variance explained by different components of models. However, |$R^{2}_{pred}$| gives the most direct answer to the question of how much variance in the data is explained by a model. |$R^{2}_{resid}$| is most appropriate for comparing models fit to different data sets, because it does not depend on sample sizes. And |$R^{2}_{lik}$| is most appropriate to assess the importance of different components within the same model applied to the same data, because it is most closely associated with statistical significance tests. [ABSTRACT FROM AUTHOR]
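The idea of a partial |$R^{2}$| as a comparison between a full model and a reduced model can be illustrated in the simplest setting, ordinary least squares with uncorrelated errors. The sketch below is not the author's implementation and does not reproduce the paper's definitions for correlated data, which rely on covariances estimated by the fitted model. It assumes Gaussian errors, uses simulated data, and all function and variable names are hypothetical. The likelihood-based form used here, 1 - exp(-(2/n)(logLik_full - logLik_reduced)), is the standard likelihood-ratio |$R^{2}$|; whether and how it is rescaled for non-Gaussian models is not covered.

```python
import numpy as np


def ols_fit(X, y):
    """Fit OLS and return (SSE, maximized Gaussian log-likelihood)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = float(resid @ resid)
    n = len(y)
    sigma2 = sse / n  # maximum-likelihood estimate of the error variance
    loglik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    return sse, loglik


def partial_r2_from_residuals(sse_full, sse_reduced):
    """Residual-based partial R^2: share of the reduced model's SSE removed by the full model."""
    return 1.0 - sse_full / sse_reduced


def partial_r2_from_likelihood(ll_full, ll_reduced, n):
    """Likelihood-ratio partial R^2 (unscaled form)."""
    return 1.0 - np.exp(-2.0 / n * (ll_full - ll_reduced))


# Simulated data (illustrative assumption): two predictors, independent errors.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.5 * x2 + rng.normal(size=n)

X_full = np.column_stack([np.ones(n), x1, x2])
X_reduced = np.column_stack([np.ones(n), x1])  # reduced model omits x2

sse_full, ll_full = ols_fit(X_full, y)
sse_reduced, ll_reduced = ols_fit(X_reduced, y)

r2_resid_ols = partial_r2_from_residuals(sse_full, sse_reduced)
r2_lik_ols = partial_r2_from_likelihood(ll_full, ll_reduced, n)
print(f"partial R2 for x2 (residual-based):   {r2_resid_ols:.4f}")
print(f"partial R2 for x2 (likelihood-based): {r2_lik_ols:.4f}")
```

In this uncorrelated Gaussian case the two quantities coincide algebraically, since the log-likelihood difference reduces to (n/2) log(SSE_reduced / SSE_full). They can differ once the model includes covariances, which is the setting the abstract addresses.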
Details
- Language :
- English
- ISSN :
- 1063-5157
- Volume :
- 68
- Issue :
- 2
- Database :
- Academic Search Index
- Journal :
- Systematic Biology
- Publication Type :
- Academic Journal
- Accession number :
- 134635526
- Full Text :
- https://doi.org/10.1093/sysbio/syy060