Robust expansion of phylogeny for fast-growing genome sequence data.

Authors :
Yongtao Ye
Marcus H Shum
Joseph L Tsui
Guangchuang Yu
David K Smith
Huachen Zhu
Joseph T Wu
Yi Guan
Tommy Tsan-Yuk Lam
Source :
PLoS Computational Biology, Vol 20, Iss 2, p e1011871 (2024)
Publication Year :
2024
Publisher :
Public Library of Science (PLoS), 2024.

Abstract

Massive sequencing of SARS-CoV-2 genomes has created an urgent need for novel methods that use existing phylogenies to add new samples efficiently instead of relying on de novo inference. 'TIPars' was developed to meet this challenge by integrating parsimony analysis with pre-computed ancestral sequences. It took about 21 seconds to insert 100 SARS-CoV-2 genomes into a 100k-taxa reference tree using 1.4 gigabytes of memory. In benchmarks on four datasets, TIPars achieved the highest accuracy for phylogenies of moderately similar sequences. For highly similar and highly divergent sequences, fully parsimony-based and likelihood-based phylogenetic placement methods, respectively, performed best, with TIPars second best. TIPars thus achieves efficient and accurate expansion of phylogenies of both similar and divergent sequences, which should have broad biological applications beyond SARS-CoV-2. TIPars is accessible from https://tipars.hku.hk/ and its source code is available at https://github.com/id-bioinfo/TIPars.
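
The general idea of parsimony-based placement against pre-computed ancestral sequences can be illustrated with a minimal Python sketch. This is not the TIPars implementation: the names `Branch`, `added_cost` and `place`, and the toy sequences, are hypothetical. The sketch assumes aligned sequences of equal length, ignores gaps and ambiguity codes, and does not estimate branch lengths. For each branch it counts the sites where the query differs from both the parent and the child sequence, which is the number of extra substitutions needed to attach the query to that branch, and it returns the branch with the lowest count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    """One edge of the reference tree (hypothetical structure for this sketch)."""
    parent: str  # name of the parent (ancestral) node
    child: str   # name of the child node (tip or internal)


def added_cost(parent_seq: str, child_seq: str, query_seq: str) -> int:
    """Extra substitutions needed to attach the query to a branch.

    A site adds one substitution only when the query state matches
    neither end of the branch; otherwise the new attachment node can
    take the shared state at no extra cost.
    """
    if not (len(parent_seq) == len(child_seq) == len(query_seq)):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        q != p and q != c
        for p, c, q in zip(parent_seq, child_seq, query_seq)
    )


def place(query_seq: str, branches: list[Branch], seqs: dict[str, str]):
    """Return (best_branch, cost); ties go to the first branch listed."""
    scored = [
        (added_cost(seqs[b.parent], seqs[b.child], query_seq), b)
        for b in branches
    ]
    cost, best = min(scored, key=lambda item: item[0])
    return best, cost


# Toy reference tree: tip sequences plus pre-computed ancestral
# sequences for the internal nodes "root" and "n1" (all invented).
seqs = {
    "root": "ACGT",
    "n1": "ACGA",
    "t1": "ACGA",
    "t2": "TCGA",
    "t3": "ACCT",
}
branches = [
    Branch("root", "n1"),
    Branch("n1", "t1"),
    Branch("n1", "t2"),
    Branch("root", "t3"),
]

best_branch, best_cost = place("TCGA", branches, seqs)
print(best_branch, best_cost)
```

Because the ancestral sequences are computed once for the reference tree, scoring a new query in this sketch only requires one pass over the branches, with no re-inference of the tree.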

Subjects

Subjects :
Biology (General)
QH301-705.5

Details

Language :
English
ISSN :
1553-734X and 1553-7358
Volume :
20
Issue :
2
Database :
Directory of Open Access Journals
Journal :
PLoS Computational Biology
Publication Type :
Academic Journal
Accession number :
edsdoj.025a7cd56d4e48fb8942fd98b1455ab3
Document Type :
article
Full Text :
https://doi.org/10.1371/journal.pcbi.1011871