Phylo2Vec: a vector representation for binary trees
- Publication Year :
- 2023
- Publisher :
- arXiv, 2023.
Abstract
- Binary phylogenetic trees inferred from biological data are central to understanding the shared evolutionary history of organisms. Inferring the placement of latent nodes in a tree by any optimality criterion (e.g., maximum likelihood) is an NP-hard problem, propelling the development of myriad heuristic approaches. Yet, these heuristics often lack a systematic means of uniformly sampling random trees or effectively exploring a tree space that grows factorially, which are crucial to optimisation problems such as machine learning. Accordingly, we present Phylo2Vec, a new parsimonious representation of a phylogenetic tree. Phylo2Vec maps any binary tree with $n$ leaves to an integer vector of length $n$. We prove that Phylo2Vec is both well-defined and bijective to the space of phylogenetic trees. The advantages of Phylo2Vec are twofold: i) easy uniform sampling of binary trees and ii) systematic ability to traverse tree space in very large or small jumps. As a proof of concept, we use Phylo2Vec for maximum likelihood inference on five real-world datasets and show that a simple hill climbing-based optimisation efficiently traverses the vastness of tree space from a random to an optimal tree.
- Comment: 20 pages, 6 figures
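The abstract's claim that a bijective integer encoding makes uniform sampling easy can be illustrated with a minimal sketch. This record does not give the construction, so the constraint used here is an assumption: a vector of `k` integers whose `i`-th entry (0-indexed) lies in `{0, ..., 2i}`. The function names are hypothetical and are not taken from any Phylo2Vec software. Under that assumption the number of valid vectors is `1 * 3 * ... * (2k - 1)`, which equals the known count `(2n - 3)!!` of rooted binary trees on `n = k + 1` labelled leaves, so drawing each entry independently and uniformly would give a uniform draw over trees if the encoding is bijective.

```python
import random
from math import prod


def sample_vector(k, rng=random):
    """Draw one vector of length k with v[i] uniform on {0, ..., 2*i}.

    ASSUMPTION: this entry-wise constraint is illustrative; it is not
    stated in the catalog record itself.
    """
    return [rng.randint(0, 2 * i) for i in range(k)]


def count_vectors(k):
    """Number of vectors satisfying the assumed constraint: prod of (2*i + 1)."""
    return prod(2 * i + 1 for i in range(k))


def count_rooted_binary_trees(n_leaves):
    """(2n - 3)!!, the number of rooted binary trees on n labelled leaves (n >= 2)."""
    return prod(range(1, 2 * n_leaves - 2, 2))


if __name__ == "__main__":
    for k in range(1, 6):
        # The two counts agree when the tree has k + 1 leaves.
        print(k, count_vectors(k), count_rooted_binary_trees(k + 1))
    print(sample_vector(5))
```

Because each entry is sampled independently, no rejection step or tree enumeration is needed; the cost of one draw is linear in the vector length.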
Details
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....c06290a5dfeec0af01beef624ece23ae
- Full Text :
- https://doi.org/10.48550/arxiv.2304.12693