$R^{2}$s for Correlated Data: Phylogenetic Models, LMMs, and GLMMs.

Authors :
Ives, Anthony R
Source :
Systematic Biology. Mar 2019, Vol. 68 Issue 2, p234-251. 18p.
Publication Year :
2019

Abstract

Many researchers want to report an $R^{2}$ to measure the variance explained by a model. When the model includes correlation among data, such as phylogenetic models and mixed models, defining an $R^{2}$ faces two conceptual problems. (i) It is unclear how to measure the variance explained by predictor (independent) variables when the model contains covariances. (ii) Researchers may want the $R^{2}$ to include the variance explained by the covariances by asking questions such as "How much of the data is explained by phylogeny?" Here, I investigated three $R^{2}$s for phylogenetic and mixed models. $R^{2}_{resid}$ is an extension of the ordinary least-squares $R^{2}$ that weights residuals by variances and covariances estimated by the model; it is closely related to $R^{2}_{glmm}$ presented by Nakagawa and Schielzeth (2013. A general and simple method for obtaining $R^{2}$ from generalized linear mixed-effects models. Methods Ecol. Evol. 4:133–142). $R^{2}_{pred}$ is based on predicting each residual from the fitted model and computing the variance between observed and predicted values. $R^{2}_{lik}$ is based on the likelihood of fitted models, and therefore, reflects the amount of information that the models contain. These three $R^{2}$s are formulated as partial $R^{2}$s, making it possible to compare the contributions of predictor variables and variance components (phylogenetic signal and random effects) to the fit of models. Because partial $R^{2}$s compare a full model with a reduced model without components of the full model, they are distinct from marginal $R^{2}$s that partition additive components of the variance. I assessed the properties of the $R^{2}$s for phylogenetic models using simulations for continuous and binary response data (phylogenetic generalized least squares and phylogenetic logistic regression).
Because the $R^{2}$s are designed broadly for any model for correlated data, I also compared $R^{2}$s for linear mixed models and generalized linear mixed models. $R^{2}_{resid}$, $R^{2}_{pred}$, and $R^{2}_{lik}$ all have similar performance in describing the variance explained by different components of models. However, $R^{2}_{pred}$ gives the most direct answer to the question of how much variance in the data is explained by a model. $R^{2}_{resid}$ is most appropriate for comparing models fit to different data sets, because it does not depend on sample sizes. And $R^{2}_{lik}$ is most appropriate to assess the importance of different components within the same model applied to the same data, because it is most closely associated with statistical significance tests. [ABSTRACT FROM AUTHOR]
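
Illustrative note (not part of the author's abstract): the idea of a likelihood-based partial $R^{2}$ that compares a full model with a reduced model can be sketched as follows. The abstract does not state a formula, so the likelihood-ratio form used here, 1 - exp(-(2/n)(logLik_full - logLik_reduced)), is an assumption; it is one common form of a generalized $R^{2}$. The example uses ordinary least squares on simulated, uncorrelated data purely for illustration. For phylogenetic or mixed models, the log-likelihoods would instead come from the fitted model with its estimated covariances. All function and variable names are hypothetical.

```python
import numpy as np


def gaussian_loglik(y, X):
    """Maximized Gaussian log-likelihood of a least-squares fit of y on X.

    Uses the maximum-likelihood variance estimate RSS / n.
    """
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    sigma2 = rss / n
    return -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)


def r2_lik(loglik_full, loglik_reduced, n):
    """Assumed likelihood-ratio form of a partial R^2 (full vs. reduced model)."""
    return 1.0 - np.exp(-2.0 / n * (loglik_full - loglik_reduced))


# Simulated data (hypothetical): two predictors, independent errors.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.3 * x2 + rng.normal(size=n)

ones = np.ones(n)
X_full = np.column_stack([ones, x1, x2])   # intercept + x1 + x2
X_no_x2 = np.column_stack([ones, x1])      # reduced model: x2 removed
X_null = ones[:, None]                     # reduced model: intercept only

ll_full = gaussian_loglik(y, X_full)
total_r2 = r2_lik(ll_full, gaussian_loglik(y, X_null), n)
partial_r2_x2 = r2_lik(ll_full, gaussian_loglik(y, X_no_x2), n)

print(f"total R2_lik (vs. intercept-only): {total_r2:.3f}")
print(f"partial R2_lik for x2:             {partial_r2_x2:.3f}")
```

In this least-squares special case the total value coincides with the ordinary $R^{2}$, since the likelihood ratio reduces to a ratio of residual sums of squares. The partial value for `x2` measures how much the fit degrades when only that predictor is removed, which is the full-versus-reduced comparison the abstract describes.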

Details

Language :
English
ISSN :
10635157
Volume :
68
Issue :
2
Database :
Academic Search Index
Journal :
Systematic Biology
Publication Type :
Academic Journal
Accession number :
134635526
Full Text :
https://doi.org/10.1093/sysbio/syy060