Back to Search Start Over

Insertions and deletions as phylogenetic signal in an alignment-free context.

Authors :
Birth, Niklas
Dencker, Thomas
Morgenstern, Burkhard
Source :
PLoS Computational Biology. 8/8/2022, Vol. 18 Issue 8, p1-16. 16p. 5 Diagrams, 4 Charts.
Publication Year :
2022

Abstract

Most methods for phylogenetic tree reconstruction are based on sequence alignments; they infer phylogenies from substitutions that may have occurred at the aligned sequence positions. Gaps in alignments are usually not employed as phylogenetic signal. In this paper, we explore an alignment-free approach that uses insertions and deletions (indels) as an additional source of information for phylogeny inference. For a set of four or more input sequences, we generate so-called quartet blocks of four putative homologous segments each. For pairs of such quartet blocks involving the same four sequences, we compare the distances between the two blocks in these sequences, to obtain hints about indels that may have happened between the blocks since the respective four sequences have evolved from their last common ancestor. A prototype implementation that we call Gap-SpaM is presented to infer phylogenetic trees from these data, using a quartet-tree approach or, alternatively, under the maximum-parsimony paradigm. This approach should not be regarded as an alternative to established methods, but rather as a complementary source of phylogenetic information. Interestingly, however, our software is able to produce phylogenetic trees from putative indels alone that are comparable to trees obtained with existing alignment-free methods. Author summary: Phylogenetic tree inference based on DNA or protein sequence comparison is a fundamental task in computational biology. Given a multiple alignment of a set of input sequences, most approaches compare aligned sequence positions to each other, to find a suitable tree, based on a model of molecular evolution. Insertions and deletions that may have happened since the input sequences evolved from their last common ancestor are ignored by most phylogeny methods. Herein, we show that insertions and deletions can provide an additional source of information for phylogeny inference, and that such information can be obtained with a simple alignment-free approach. We provide an implementation of this idea that we call Gap-SpaM. The proposed approach is complementary to existing phylogeny methods since it is based on a completely different source of information. It is, thus, not meant to be an alternative to those existing methods but rather as a possible additional source of information for tree inference. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1553734X
Volume :
18
Issue :
8
Database :
Academic Search Index
Journal :
PLoS Computational Biology
Publication Type :
Academic Journal
Accession number :
158409635
Full Text :
https://doi.org/10.1371/journal.pcbi.1010303