1. On computing the maximum parsimony score of a phylogenetic network
- Author
- Celine Scornavacca, Mareike Fischer, Steven Kelk, Leo van Iersel
- Affiliations
- RS: FSE DACS BMI, DKE Scientific staff, Ernst-Moritz-Arndt-Universität Greifswald, Delft University of Technology (TU Delft), Department of data science and Knowledge Engineering [Maastricht], Maastricht University [Maastricht], Institut des Sciences de l'Evolution de Montpellier (UMR ISEM), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-École Pratique des Hautes Études (EPHE), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Université de Montpellier (UM)-Institut de recherche pour le développement [IRD] : UR226-Centre National de la Recherche Scientifique (CNRS), and Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-École pratique des hautes études (EPHE)
- Subjects
General Mathematics; F.2; 0206 medical engineering; G.2; approximability; Inference; 02 engineering and technology; Network topology; parsimony; Combinatorics; 03 medical and health sciences; phylogenetic networks; FOS: Mathematics; Mathematics - Combinatorics; [INFO]Computer Science [cs]; Quantitative Biology - Populations and Evolution; Integer programming; 030304 developmental biology; Mathematics; Discrete mathematics; 0303 health sciences; Phylogenetic tree; AMS subject classifications: 68W25, 05C20, 90C27, 92B10; software; Populations and Evolution (q-bio.PE); Phylogenetic network; Tree (graph theory); Maximum parsimony; phylogenetic trees; Tree rearrangement; FOS: Biological sciences; fixed-parameter tractability; Combinatorics (math.CO); [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; complexity; Algorithm; 020602 bioinformatics
- Abstract
Phylogenetic networks are used to display the relationships among species whose evolution is not treelike, as is the case, for instance, in the presence of hybridization events or horizontal gene transfers. Tree inference methods such as maximum parsimony need to be modified in order to be applicable to networks. In this paper, we discuss two different definitions of maximum parsimony on networks, “hardwired” and “softwired,” and examine the complexity of computing them given a network topology and a character. By exploiting a link with the problem Multiterminal Cut, we show that computing the hardwired parsimony score for 2-state characters is polynomial-time solvable, while for characters with more states this problem becomes NP-hard but is still approximable and fixed-parameter tractable in the parsimony score. On the other hand, we show that, for the softwired definition, obtaining even weak approximation guarantees is already difficult for binary characters and restricted network topologies, and fixed-parameter tractable algorithms in the parsimony score are unlikely. On the positive side, we show that computing the softwired parsimony score is fixed-parameter tractable in the level of the network, a natural parameter describing how tangled reticulate activity is in the network. Finally, we show that both the hardwired and the softwired parsimony scores can be computed efficiently using integer linear programming. The software has been made freely available.
- Published
- 2015