Back to Search Start Over

Assessing approaches for inferring species trees from multi-copy genes.

Authors :
Chaudhary R
Boussau B
Burleigh JG
Fernández-Baca D
Source :
Systematic biology [Syst Biol] 2015 Mar; Vol. 64 (2), pp. 325-39. Date of Electronic Publication: 2014 Dec 23.
Publication Year :
2015

Abstract

With the availability of genomic sequence data, there is increasing interest in using genes with a possible history of duplication and loss for species tree inference. Here we assess the performance of both nonprobabilistic and probabilistic species tree inference approaches using gene duplication and loss and coalescence simulations. We evaluated the performance of gene tree parsimony (GTP) based on duplication (Only-dup), duplication and loss (Dup-loss), and deep coalescence (Deep-c) costs, the NJst distance method, the MulRF supertree method, and PHYLDOG, which jointly estimates gene trees and species tree using a hierarchical probabilistic model. We examined the effects of gene tree and species sampling, gene tree error, and duplication and loss rates on the accuracy of phylogenetic estimates. In the 10-taxon duplication and loss simulation experiments, MulRF is more accurate than the other methods when the duplication and loss rates are low, and Dup-loss is generally the most accurate when the duplication and loss rates are high. PHYLDOG performs well in 10-taxon duplication and loss simulations, but its run time is prohibitively long on larger data sets. In the larger duplication and loss simulation experiments, MulRF outperforms all other methods in experiments with at most 100 taxa; however, in the larger simulation, Dup-loss generally performs best. In all duplication and loss simulation experiments with more than 10 taxa, all methods perform better with more gene trees and fewer missing sequences, and they are all affected by gene tree error. Our results also highlight high levels of error in estimates of duplications and losses from GTP methods and demonstrate the usefulness of methods based on generic tree distances for large analyses.<br /> (© The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.)

Details

Language :
English
ISSN :
1076-836X
Volume :
64
Issue :
2
Database :
MEDLINE
Journal :
Systematic biology
Publication Type :
Academic Journal
Accession number :
25540456
Full Text :
https://doi.org/10.1093/sysbio/syu128