
Completing gene trees without species trees in sub-quadratic time.

Authors :
Mai, Uyen
Mirarab, Siavash
Source :
Bioinformatics; 3/15/2022, Vol. 38 Issue 6, p1532-1541, 10p
Publication Year :
2022

Abstract

Motivation: As genome-wide reconstruction of phylogenetic trees becomes more widespread, limitations of available data are being appreciated more than ever before. One issue is that phylogenomic datasets are riddled with missing data, and gene trees, in particular, almost always lack representatives from some species otherwise available in the dataset. Since many downstream applications of gene trees require or can benefit from access to complete gene trees, it will be beneficial to algorithmically complete gene trees. Also, gene trees are often unrooted, and rooting them is useful for downstream applications. While completing and rooting a gene tree with respect to a given species tree has been studied, those problems are not studied in depth when we lack such a reference species tree.

Results: We study completion of gene trees without a need for a reference species tree. We formulate an optimization problem to complete the gene trees while minimizing their quartet distance to the given set of gene trees. We extend a seminal algorithm by Brodal et al. to solve this problem in quasi-linear time. In simulated studies and on a large empirical dataset, we show that completion of gene trees using other gene trees is relatively accurate and, unlike the case where a species tree is available, is unbiased.

Availability and implementation: Our method, tripVote, is available at https://github.com/uym2/tripVote.

Supplementary information: Supplementary data are available at Bioinformatics online.

[ABSTRACT FROM AUTHOR]
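
To make the objective named in the abstract concrete, the following is a minimal brute-force sketch of the quartet distance between two unrooted trees on the same leaf set. It is an illustration only: it is neither the quasi-linear algorithm the abstract refers to nor tripVote's code. The adjacency-dict tree representation, the function names, and the two example trees are assumptions made for this sketch. Because it enumerates every four-leaf subset, its running time grows on the order of n^4 in the number of leaves.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency dict from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def leaf_distances(adj):
    """Return {leaf: {node: hop count}} using a BFS from every leaf."""
    leaves = [v for v, nbrs in adj.items() if len(nbrs) == 1]
    dist = {}
    for leaf in leaves:
        seen = {leaf: 0}
        queue = deque([leaf])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen[w] = seen[u] + 1
                    queue.append(w)
        dist[leaf] = seen
    return dist


def quartet_topology(dist, a, b, c, d):
    """Topology induced on four leaves, found with the four-point condition.

    Returns the pairing with the strictly smallest distance sum,
    or None when the quartet is unresolved (all three sums tie).
    """
    sums = {
        frozenset([frozenset([a, b]), frozenset([c, d])]): dist[a][b] + dist[c][d],
        frozenset([frozenset([a, c]), frozenset([b, d])]): dist[a][c] + dist[b][d],
        frozenset([frozenset([a, d]), frozenset([b, c])]): dist[a][d] + dist[b][c],
    }
    best = min(sums.values())
    winners = [pairing for pairing, total in sums.items() if total == best]
    return winners[0] if len(winners) == 1 else None


def quartet_distance(adj1, adj2):
    """Count four-leaf subsets whose induced topology differs between the trees."""
    d1, d2 = leaf_distances(adj1), leaf_distances(adj2)
    shared = sorted(set(d1) & set(d2))  # compare only leaves present in both trees
    return sum(
        quartet_topology(d1, *q) != quartet_topology(d2, *q)
        for q in combinations(shared, 4)
    )


# Hypothetical example trees: ((a,b),c,(d,e)) and ((a,c),b,(d,e)).
t1 = build_adjacency([("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"),
                      ("v", "w"), ("d", "w"), ("e", "w")])
t2 = build_adjacency([("a", "u"), ("c", "u"), ("u", "v"), ("b", "v"),
                      ("v", "w"), ("d", "w"), ("e", "w")])
print(quartet_distance(t1, t2))
```

In this sketch, a completion procedure would be scored by summing `quartet_distance` between a candidate completed tree and each of the other gene trees; how the search over candidates is organized is left open here.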

Details

Language :
English
ISSN :
13674803
Volume :
38
Issue :
6
Database :
Complementary Index
Journal :
Bioinformatics
Publication Type :
Academic Journal
Accession number :
155584900
Full Text :
https://doi.org/10.1093/bioinformatics/btab875