Statistically Consistent k-mer Methods for Phylogenetic Tree Reconstruction.
- Source :
- Journal of Computational Biology. Feb 2017, Vol. 24 Issue 2, p153-171. 19p.
- Publication Year :
- 2017
Abstract
- Frequencies of k-mers in sequences are sometimes used as a basis for inferring phylogenetic trees without first obtaining a multiple sequence alignment. We show that a standard approach of using the squared Euclidean distance between k-mer vectors to approximate a tree metric can be statistically inconsistent. To remedy this, we derive model-based distance corrections for orthologous sequences without gaps, which lead to consistent tree inference. The identifiability of model parameters from k-mer frequencies is also studied. Finally, we report simulations showing that the corrected distance outperforms many other k-mer methods, even when sequences are generated with an insertion and deletion process. These results have implications for multiple sequence alignment as well since k-mer methods are usually the first step in constructing a guide tree for such algorithms. [ABSTRACT FROM AUTHOR]
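As an illustration of the uncorrected approach the abstract describes, the following minimal sketch builds k-mer frequency vectors for two sequences and takes the squared Euclidean distance between them. The function names `kmer_frequencies` and `squared_euclidean`, the DNA alphabet, and the choice to skip windows containing non-alphabet characters are assumptions made for this example; they are not taken from the paper. The model-based distance correction derived by the authors is not reproduced here.

```python
from itertools import product


def kmer_frequencies(seq, k, alphabet="ACGT"):
    """Return the relative frequency of every k-mer over `alphabet` in `seq`.

    The vector is ordered lexicographically by k-mer (per the alphabet order),
    so vectors from different sequences are directly comparable.
    Windows containing characters outside the alphabet are skipped
    (an assumption of this sketch).
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:
            counts[word] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[w] / total for w in kmers]


def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


if __name__ == "__main__":
    x = kmer_frequencies("ACGTACGTAC", 2)
    y = kmer_frequencies("ACGTTCGTAA", 2)
    print(squared_euclidean(x, y))
```

A matrix of such pairwise distances could then be passed to a distance-based tree-building method; according to the abstract, using the raw distance in this way can be statistically inconsistent, which is what motivates the authors' correction.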
Details
- Language :
- English
- ISSN :
- 10665277
- Volume :
- 24
- Issue :
- 2
- Database :
- Academic Search Index
- Journal :
- Journal of Computational Biology
- Publication Type :
- Academic Journal
- Accession number :
- 121037191
- Full Text :
- https://doi.org/10.1089/cmb.2015.0216