TreeLign

Authors :
Shaojie Zhang
Aaron L. Halpern
Yuan Li
Source :
BCB
Publication Year :
2011
Publisher :
ACM, 2011.

Abstract

Phylogenetic assignment of 16S rRNA has been frequently used for taxonomic classification. Recently, high-throughput sequencing, especially in the context of environmental or metagenomic sequencing projects, has made fast and accurate taxonomic classification an important goal. Existing classification methods are either fast but too coarse-grained and inaccurate, or fine-grained and accurate but too slow for use in practice. In this paper, we propose a new computational method, TreeLign, to rapidly and accurately conduct alignment and phylogenetic assignment for novel sequences, given a reference phylogenetic tree and an alignment. TreeLign first constructs a profile for every branch of the reference tree, then, for each query sequence, tries assigning it to every possible branch, and finally obtains a new tree and a new alignment that are jointly optimal in terms of Maximum Parsimony (MP). We tested the accuracy and robustness of TreeLign on both a large and a small 16S rRNA dataset extracted from the core set of GreenGenes. The results on the large dataset show that the assignments of TreeLign are in general consistent with the phylogenetic tree of the core set of GreenGenes. The results on the small dataset show that TreeLign achieves accuracy comparable to that of existing maximum-likelihood-based methods while requiring much less computational time.
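
The "try every branch, keep the most parsimonious result" step described in the abstract can be illustrated with a small sketch. This is not TreeLign's implementation: it skips the branch profiles and the realignment, and it assumes the query has already been aligned to the reference columns. It simply attaches the query to each branch in turn and scores the resulting tree with the standard Fitch parsimony count. The function names (`fitch_score`, `insertions`, `place`) and the toy sequences are hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterator, List, Set, Tuple, Union

# A rooted binary tree: a leaf name, or a (left, right) pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_score(tree: Tree, seqs: Dict[str, str]) -> int:
    """Minimum number of substitutions on a fixed binary tree (Fitch algorithm)."""

    def visit(node: Tree) -> Tuple[List[Set[str]], int]:
        if isinstance(node, str):
            return [{c} for c in seqs[node]], 0
        (left_sets, left_cost), (right_sets, right_cost) = visit(node[0]), visit(node[1])
        sets, cost = [], left_cost + right_cost
        for a, b in zip(left_sets, right_sets):
            common = a & b
            if common:
                sets.append(common)
            else:
                # Disjoint state sets: one substitution is needed in this column.
                sets.append(a | b)
                cost += 1
        return sets, cost

    return visit(tree)[1]


def insertions(tree: Tree, leaf: str) -> Iterator[Tree]:
    """Yield every tree obtained by attaching `leaf` to one branch of `tree`."""
    yield (tree, leaf)  # attach on the branch directly above this subtree
    if not isinstance(tree, str):
        left, right = tree
        for t in insertions(left, leaf):
            yield (t, right)
        for t in insertions(right, leaf):
            yield (left, t)


def place(tree: Tree, seqs: Dict[str, str], name: str, query: str) -> Tuple[int, Tree]:
    """Return (score, tree) for the lowest-scoring placement; ties keep the first found."""
    all_seqs = {**seqs, name: query}
    scored = [(fitch_score(t, all_seqs), t) for t in insertions(tree, name)]
    return min(scored, key=lambda pair: pair[0])


# Toy reference data (made up for this sketch, not from GreenGenes).
tree: Tree = (("A", "B"), ("C", "D"))
seqs = {"A": "AACC", "B": "AACC", "C": "GGCC", "D": "GGCA"}
score, best = place(tree, seqs, "Q", "GGCA")
print(score, best)
```

Rescoring the whole tree for every candidate branch is the naive approach. Precomputing one profile per branch, as the abstract describes, is presumably what lets a method avoid repeating that work for each query.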

Details

Database :
OpenAIRE
Journal :
Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Accession number :
edsair.doi...........7a2cb8c950c243d97b27254a10868902
Full Text :
https://doi.org/10.1145/2147805.2147868