126 results on '"Dondi, R"'
Search Results
2. The cell-penetrating peptide chlorin conjugate, a novel class of photosensitiser for photodynamic therapy and photochemical internalisation: O124
- Author
-
Alade, A. O., Yaghini, E., Dondi, R., Eggleston, I. E., and MacRobert, A. J.
- Published
- 2015
3. Patient Engagement: Theoretical and Heuristic Approaches for Supporting the Clinical Practice
- Author
-
Zoppis, I., Dondi, R., Manzoni, S., Giancarlo Mauri, Bandini, S, Cortellessa, G, Gorrini, A, Palumbo, F, Zoppis, I, Dondi, R, Manzoni, S, and Mauri, G
- Subjects
graph optimization ,Genetica Algoritm ,Social interaction - Abstract
Social interaction allows to support the disease management by creating online spaces where patients can interact with clinicians, and share experiences with other patients. Therefore, promoting targeed communication in online social spaces is a mean to group patients around shared goals, offer emotional support, and finally engage patients in their healthcare decision-making process. In this paper, we approach the argument from a theoretical perspective: we design an optimization problem aimed to encourage the creation of (induced) sub-networks of patients which, being recently diagnosed, wish to deepen the knowledge about their medical treatment with some other similar profiled patients, which have already been followed up by specific (even alternative) care centers. In particular, due to the computational hardness of the proposed problem, we provide approximated solutions based on distributed heuristics (i.e., Genetic Algorithms). Results are given for simulated data using Erd ̈os-R ́enyi random graphs.
- Published
- 2019
4. Top-k Overlapping Densest Subgraphs: Approximation and Complexity
- Author
-
Dondi, R., Mehdi Hosseinzadeh, Mauri, G., Zoppis, I., Dondi, R, Mehdi Hosseinzadeh, M, Mauri, G, and Zoppis, I
- Subjects
Social and Information Networks (cs.SI) ,FOS: Computer and information sciences ,Settore INF/01 - Informatica ,INF/01 - INFORMATICA ,Computer Science - Social and Information Networks ,Computational Complexity (cs.CC) ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI ,Computer Science - Computational Complexity ,Computer Science::Discrete Mathematics ,Computer Science - Data Structures and Algorithms ,Data Structures and Algorithms (cs.DS) ,densest subgraph ,approximation algorithms ,MathematicsofComputing_DISCRETEMATHEMATICS - Abstract
A central problem in graph mining is finding dense subgraphs, with several applications in different fields, a notable example being identifying communities. While a lot of effort has been put on the problem of finding a single dense subgraph, only recently the focus has been shifted to the problem of finding a set of densest subgraphs. Some approaches aim at finding disjoint subgraphs, while in many real-world networks communities are often overlapping. An approach introduced to find possible overlapping subgraphs is the Top-k Overlapping Densest Subgraphs problem. For a given integer k >= 1, the goal of this problem is to find a set of k densest subgraphs that may share some vertices. The objective function to be maximized takes into account both the density of the subgraphs and the distance between subgraphs in the solution. The Top-k Overlapping Densest Subgraphs problem has been shown to admit a 1/10-factor approximation algorithm. Furthermore, the computational complexity of the problem has been left open. In this paper, we present contributions concerning the approximability and the computational complexity of the problem. For the approximability, we present approximation algorithms that improves the approximation factor to 1/2 , when k is bounded by the vertex set, and to 2/3 when k is a constant. For the computational complexity, we show that the problem is NP-hard even when k = 3.
- Published
- 2019
5. An Explainable Recommender System for WhoTeach Educational Platform
- Author
-
Manzoni S., Zoppis I., Mauri G., Dondi R., Marconi L Francesco Epifania, Manzoni, S, Zoppis, I, Mauri, G, Dondi, R, and Marconi, L
- Subjects
Social Networks, Optimized Social Explanation, Communities Identification, Genetic Algorithms, WhoTeach - Published
- 2019
6. Optimizing social interaction a computational approach to support patient engagement
- Author
-
Zoppis, I., Dondi, R., Santoro, E., Castelnuovo, G., Sicurello, F., Giancarlo Mauri, Mauri, G, Sicurello, F, Castelnuovo, G, Santoro, E, Dondi, R, and Zoppis, I
- Subjects
Optimization ,Settore INF/01 - Informatica ,Social Networks ,Cohesive Sub-Graphs ,Genetic Algorithms ,INF/01 - INFORMATICA ,Settore M-PSI/08 - PSICOLOGIA CLINICA ,Social Networks, Optimization, Cohesive Sub-Graphs, Genetic Algorithms ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI - Abstract
Social media can directly support disease management by creating online spaces where patients can interact with clinicians, and share experiences with other patients. Nevertheless, much more work remains to be carried out for providing and sharing an optimized information content. In this paper we formulate, from a theoretical perspective, an optimization problem aimed to encourage the creation of a sub-network of patients which, being recently diagnosed, wish to deepen their knowledge about their pathologies with some other patients, whose clinical profile turn to be similar, and have already been followed up within specific, even alternative, care centers. We will focus on the hardness of the proposed problem and provide a Genetic Algorithm (GAbased) approach to seek faster approximated solutions.
- Published
- 2018
7. The longest filled common subsequence problem
- Author
-
Castelli, M, Dondi, R, MAURI, GIANCARLO, ZOPPIS, ITALO FRANCESCO, NOVA Information Management School (NOVA IMS), Information Management Research Center (MagIC) - NOVA Information Management School, Kärkkäinen, J, Radoszewski, J, Rytter, W, Castelli, M, Dondi, R, Mauri, G, and Zoppis, I
- Subjects
000 Computer science, knowledge, general works ,Longest common subsequence ,Settore INF/01 - Informatica ,Approximation algorithms ,Computational complexity ,Fixed-parameter algorithms ,Software ,INF/01 - INFORMATICA ,0102 computer and information sciences ,02 engineering and technology ,Approximation algorithm ,01 natural sciences ,010201 computation theory & mathematics ,Fixed-parameter algorithm ,Computer Science ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing - Abstract
Castelli, M., Dondi, R., Mauri, G., & Zoppis, I. (2017). The longest filled common subsequence problem. In 28th Annual Symposium on Combinatorial Pattern Matching, CPM 2017 (Vol. 78). [14] Schloss Dagstuhl- Leibniz-Zentrum fur Informatik GmbH, Dagstuhl Publishing. https://doi.org/10.4230/LIPIcs.CPM.2017.14 Inspired by a recent approach for genome reconstruction from incomplete data, we consider a variant of the longest common subsequence problem for the comparison of two sequences, one of which is incomplete, i.e. it has some missing elements. The new combinatorial problem, called Longest Filled Common Subsequence, given two sequences A and B, and a multiset M of symbols missing in B, asks for a sequence B∗ obtained by inserting the symbols of M into B so that B∗ induces a common subsequence with A of maximum length. First, we investigate the computational and approximation complexity of the problem and we show that it is NP-hard and APX-hard when A contains at most two occurrences of each symbol. Then, we give a 3/5-approximation algorithm for the problem. Finally, we present a fixed-parameter algorithm, when the problem is parameterized by the number of symbols inserted in B that "match" symbols of A. publishersversion published
- Published
- 2017
8. Preface
- Author
-
Dondi, R, Fertin, G, MAURI, GIANCARLO, Dondi, R, Fertin, G, and Mauri, G
- Subjects
Computer Science (all) ,INF/01 - INFORMATICA ,Theoretical Computer Science - Published
- 2016
9. Peptide-targeted prodrugs for aminolaevulinic acid photodynamic therapy
- Author
-
Tewari, K.M., primary, Yaghini, E., additional, Reelfs, O., additional, Dondi, R., additional, Pourzand, C., additional, MacRobert, A.J., additional, and Eggleston, I.M., additional
- Published
- 2017
- Full Text
- View/download PDF
10. Farfalla: a step toward an inclusive web
- Author
-
MANGIATORDI, ANDREA, Dondi, R., Mangiatordi, A, and Dondi, R
- Subjects
M-PED/03 - DIDATTICA E PEDAGOGIA SPECIALE ,accessibility, web applications, assistive technology, web 2.0, architectural barriers ,INF/01 - INFORMATICA - Abstract
The concept of Architectural Barrier can be extended to the Web, in the sense that people with disability can experience problems in accessing pages which are designed for the able-bodied. Despite of this, many users with different kind of disabilities regularly access the web using some kind of Assistive Technology. This means they can use a variety of resources and web 2.0 tools, but they need to be on their own computers, with their own accessibility software and configuration, in order to use them at best. The Farfalla project is a web application which aims to allow users to set their accessibility preferences on a centralized website and then to apply them to every other website they will visit. This approach can work from any web browser, on (almost) any machine, even without administrative privileges: the only requirement is a javascript-enabled browser with internet connection. In order to accomplish this, the Farfalla script only needs to be included in a target web page in the form of a small piece of HTML code. The attempt is to move the accessibility solutions from the user’s computer to the web itself; in this paper is explained why, if widespread, this practice could be the base for a truly inclusive web
- Published
- 2010
11. A PTAS for the Minimum Consensus Clustering Problem with a Fixed Number of Clusters
- Author
-
BONIZZONI, PAOLA, DELLA VEDOVA, GIANLUCA, Dondi, R., Bonizzoni, P, DELLA VEDOVA, G, and Dondi, R
- Subjects
FOS: Computer and information sciences ,Computer Science - Data Structures and Algorithms ,INF/01 - INFORMATICA ,PTAS, consensus clustering ,Data Structures and Algorithms (cs.DS) - Abstract
The Consensus Clustering problem has been introduced as an effective way to analyze the results of different microarray experiments. The problem consists of looking for a partition that best summarizes a set of input partitions (each corresponding to a different microarray experiment) under a simple and intuitive cost function. The problem admits polynomial time algorithms on two input partitions, but is APX-hard on three input partitions. We investigate the restriction of Consensus Clustering when the output partition is required to contain at most k sets, giving a polynomial time approximation scheme (PTAS) while proving the NP-hardness of this restriction.
- Published
- 2009
- Full Text
- View/download PDF
12. The Haplotyping Problem: An Overview of Computational Models and Solutions
- Author
-
BONIZZONI, PAOLA, DELLA VEDOVA, GIANLUCA, Dondi, R, Li, J., Bonizzoni, P, DELLA VEDOVA, G, Dondi, R, and Li, J
- Subjects
pure parsimony ,pedigree ,INF/01 - INFORMATICA ,haplotyping - Published
- 2007
13. Flexible synthesis of cationic peptide–porphyrin derivatives for light-triggered drug delivery and photodynamic therapy
- Author
-
Dondi, R., primary, Yaghini, E., additional, Tewari, K. M., additional, Wang, L., additional, Giuntini, F., additional, Loizidou, M., additional, MacRobert, A. J., additional, and Eggleston, I. M., additional
- Published
- 2016
- Full Text
- View/download PDF
14. Knowledge Organization and Retrieval in the MILK System
- Author
-
BOSELLI, ROBERTO, DE PAOLI, FLAVIO MARIA, Dondi, R., Boselli, R, DE PAOLI, F, and Dondi, R
- Subjects
knowledge management, MILK system, ontologies, knowledge organization ,INF/01 - INFORMATICA - Abstract
The aim of this paper is to illustrate the approach and the solutions that have been adopted by the MILK system to deliver a knowledge management environment to support workers during their every-day activities. The main goal of MILK is to collect and organize the knowledge associated with documents and projects within an organization. Profiling through metadata is the approach to classify, organize, retrieve and present knowledge. People are also profiled and organized in project teams and communities of interest to capture user needs and let the system supply users with comprehensive access to knowledge.
- Published
- 2003
15. Stimulating knowledge discovery and sharing
- Author
-
Agostini, A., primary, Albolino, S., additional, De Michelis, G., additional, De Paoli, F., additional, and Dondi, R., additional
- Published
- 2003
- Full Text
- View/download PDF
16. Stimulating knowledge discovery and sharing[28] (abstract only)
- Author
-
Agostini, A., primary, Albolino, S., additional, De Michelis, G., additional, De Paoli, F., additional, and Dondi, R., additional
- Published
- 2003
- Full Text
- View/download PDF
17. Stimulating knowledge discovery and sharing.
- Author
-
Agostini, A., Albolino, S., De Michelis, G., De Paoli, F., and Dondi, R.
- Published
- 2003
- Full Text
- View/download PDF
18. Exemplar Longest Common Subsequence.
- Author
-
Bonizzoni, P., Vedova, G.D., Dondi, R., Fertin, G., Rizzi, R., and Vialette, S.
- Abstract
In this paper, we investigate the computational and approximation complexity of the Exemplar Longest Common Subsequence (ELCS) of a set of sequences (ELCS problem), a generalization of the Longest Common Subsequence problem, where the input sequences are over the union of two disjoint sets of symbols, a set of mandatory symbols and a set of optional symbols. We show that different versions of the problem are APX-hard even for instances with two sequences. Moreover, we show that the related problem of determining the existence of a feasible solution of the ELCS of two sequences is NP-hard. On the positive side, we first present an efficient algorithm for the ELCS problem over instances of two sequences where each mandatory symbol can appear in total at most three times in the sequences. Furthermore, we present two fixed-parameter algorithms for the ELCS problem over instances of two sequences where the parameter is the number of mandatory symbols. [ABSTRACT FROM PUBLISHER]
- Published
- 2007
- Full Text
- View/download PDF
19. A library of efficient bioinformatics algorithms
- Author
-
Gianluca Della Vedova, Dondi R, DELLA VEDOVA, G, and Dondi, R
- Subjects
Algorithm ,Internet ,Librarie ,Software Design ,Databases, Genetic ,Computational Biology ,Information Storage and Retrieval ,Programming Language ,Software - Abstract
In this paper we review some of the existing projects available in the bioinformatics field for facilitating the development of programs, but for which minimising the running time is not of primary importance. We point out the advantages of open source libraries for such tasks and we discuss some of the open source licenses available. Finally, we present the project ALiBio, which is aimed at facilitating the development of efficient programs in bioinformatics.
20. Preface
- Author
-
Bures, T., Dondi, R., Gamper, J., Guerrini, G., Jurdzinski, T., Pahl, C., Sikora, F., and Wong, P. W. H.
- Published
- 1967
21. On the complexity of approximately matching a string to a directed graph
- Author
-
Riccardo Dondi, Giancarlo Mauri, Italo Zoppis, Dondi, R, Mauri, G, and Zoppis, I
- Subjects
approximate string matching ,Settore INF/01 - Informatica ,Matching (graph theory) ,Query string ,Computer science ,String (computer science) ,INF/01 - INFORMATICA ,Parameterized complexity ,Directed graph ,Approximate string matching ,Computer Science Applications ,Theoretical Computer Science ,Combinatorics ,High Energy Physics::Theory ,Computational Theory and Mathematics ,Path (graph theory) ,Graph (abstract data type) ,Information Systems - Abstract
The problem of matching a query string to a directed graph, whose vertices are labeled by strings, has application in different fields. In this paper we present results on the complexity of the approximate matching problem, where edit operations are symbol substitutions and are allowed only to the graph labels or both to the graph labels and the query string. We show that deciding if there exists a path in a graph that represents a query string with edit operations to vertex labels is an NP-complete problem and it is fixed-parameter tractable when parameterized by the length of the input string. We show that the variants of approximate string matching we consider are not fixed-parameter tractable, when the parameter is (1) the length of the query string and (2) the number of edit operations. Moreover, we provide inapproximability results for these variants of the of approximate string matching problem.
- Published
- 2022
22. Novel Results on the Number of Runs of the Burrows-Wheeler-Transform
- Author
-
Anna Toffanello, Marinella Sciortino, Nicola Prezza, Zsuzsanna Lipták, Shunsuke Inenaga, Sara Giuliani, Bureš, T, Dondi, R, Gamper, J, Guerrini, G, Jurdziński, T, Pahl, C, Sikora, F, Wong, PWH, Giuliani S., Inenaga S., Liptak Z., Prezza N., Sciortino M., and Toffanello A.
- Subjects
FOS: Computer and information sciences ,Burrows–Wheeler transform ,Settore INF/01 - Informatica ,Combinatorics on words ,Formal Languages and Automata Theory (cs.FL) ,Computer science ,String (computer science) ,Search engine indexing ,Compressed data structures ,Computer Science - Formal Languages and Automata Theory ,String indexing ,Data structure ,Measure (mathematics) ,Burrows-Wheeler-Transform ,Repetitiveness ,Burrows-Wheeler-Transform, Compressed data structures, String indexing, Repetitiveness, Combinatorics on words ,Transformation (function) ,Computer Science - Data Structures and Algorithms ,Data Structures and Algorithms (cs.DS) ,Algorithm ,Data compression - Abstract
The Burrows-Wheeler-Transform (BWT), a reversible string transformation, is one of the fundamental components of many current data structures in string processing. It is central in data compression, as well as in efficient query algorithms for sequence data, such as webpages, genomic and other biological sequences, or indeed any textual data. The BWT lends itself well to compression because its number of equal-letter-runs (usually referred to as $r$) is often considerably lower than that of the original string; in particular, it is well suited for strings with many repeated factors. In fact, much attention has been paid to the $r$ parameter as measure of repetitiveness, especially to evaluate the performance in terms of both space and time of compressed indexing data structures. In this paper, we investigate $\rho(v)$, the ratio of $r$ and of the number of runs of the BWT of the reverse of $v$. Kempa and Kociumaka [FOCS 2020] gave the first non-trivial upper bound as $\rho(v) = O(\log^2(n))$, for any string $v$ of length $n$. However, nothing is known about the tightness of this upper bound. We present infinite families of binary strings for which $\rho(v) = \Theta(\log n)$ holds, thus giving the first non-trivial lower bound on $\rho(n)$, the maximum over all strings of length $n$. Our results suggest that $r$ is not an ideal measure of the repetitiveness of the string, since the number of repeated factors is invariant between the string and its reverse. We believe that there is a more intricate relationship between the number of runs of the BWT and the string's combinatorial properties., Comment: 14 pages, 2 figues
- Published
- 2021
23. Complexity Issues of String to Graph Approximate Matching
- Author
-
Italo Zoppis, Riccardo Dondi, Giancarlo Mauri, A. Leporati, C. Martín-Vide, D. Shapira, C. Zandron, Dondi, R, Mauri, G, and Zoppis, I
- Subjects
FOS: Computer and information sciences ,050101 languages & linguistics ,Computational complexity theory ,Computer science ,Existential quantification ,Algorithms on strings, Computational complexity, Graph query, Parameterized complexity, Patterns , String to graph matching ,Binary number ,Parameterized complexity ,02 engineering and technology ,Computational Complexity (cs.CC) ,Algorithms on strings ,Article ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI ,Combinatorics ,Polynomial kernel ,Computer Science - Data Structures and Algorithms ,0202 electrical engineering, electronic engineering, information engineering ,Quantitative Biology - Genomics ,0501 psychology and cognitive sciences ,Data Structures and Algorithms (cs.DS) ,Patterns ,Genomics (q-bio.GN) ,Query string ,Settore INF/01 - Informatica ,String to graph matching ,05 social sciences ,INF/01 - INFORMATICA ,Computational complexity ,Graph query ,Directed graph ,Approximate string matching ,Computer Science - Computational Complexity ,FOS: Biological sciences ,020201 artificial intelligence & image processing - Abstract
The problem of matching a query string to a directed graph, whose vertices are labeled by strings, has application in different fields, from data mining to computational biology. Several variants of the problem have been considered, depending on the fact that the match is exact or approximate and, in this latter case, which edit operations are considered and where are allowed. In this paper we present results on the complexity of the approximate matching problem, where edit operations are symbol substitutions and are allowed only on the graph labels or both on the graph labels and the query string. We introduce a variant of the problem that asks whether there exists a path in a graph that represents a query string with any number of edit operations and we show that is is NP-complete, even when labels have length one and in the case the alphabet is binary. Moreover, when it is parameterized by the length of the input string and graph labels have length one, we show that the problem is fixed-parameter tractable and it is unlikely to admit a polynomial kernel. The NP-completeness of this problem leads to the inapproximability (within any factor) of the approximate matching when edit operations are allowed only on the graph labels. Moreover, we show that the variants of approximate string matching to graph we consider are not fixed-parameter tractable, when the parameter is the number of edit operations, even for graphs that have distance one from a DAG. The reduction for this latter result allows us to prove the inapproximability of the variant where edit operations can be applied both on the query string and on graph labels., Comment: Extended version of a paper accepted to LATA 2020
- Published
- 2020
- Full Text
- View/download PDF
24. Up-to Techniques for Branching Bisimilarity
- Author
-
Erkens, Rick, Rot, Jurriaan, Luttik, Bas, Chatzigeorgiou, Alexander, Dondi, Riccardo, Herodotou, Herodotos, Kapoutsis, Christos, Manolopoulos, Yannis, Papadopoulos, George A., Sikora, Florian, Formal System Analysis, EAISI Foundational, Chatzigeorgiou, A., Dondi, R., Herodotou, H., Kapoutsis, C., Manolopoulos, Y., Papadopoulos, G.A., and Sikora, F.
- Subjects
FOS: Computer and information sciences ,Computer Science - Logic in Computer Science ,Computer science ,Context (language use) ,0102 computer and information sciences ,02 engineering and technology ,01 natural sciences ,Logic in Computer Science (cs.LO) ,Branching (linguistics) ,010201 computation theory & mathematics ,020204 information systems ,Software Science ,0202 electrical engineering, electronic engineering, information engineering ,Calculus - Abstract
Ever since the introduction of behavioral equivalences on processes one has been searching for efficient proof techniques that accompany those equivalences. Both strong bisimilarity and weak bisimilarity are accompanied by an arsenal of up-to techniques: enhancements of their proof methods. For branching bisimilarity, these results have not been established yet. We show that a powerful proof technique is sound for branching bisimilarity by combining the three techniques of up to union, up to expansion and up to context for Bloom’s BB cool format. We then make an initial proposal for casting the correctness proof of the up to context technique in an abstract coalgebraic setting, covering branching but also, delay and weak bisimilarity.
- Published
- 2020
- Full Text
- View/download PDF
25. Comparing incomplete sequences via longest common subsequence
- Author
-
Riccardo Dondi, Italo Zoppis, Giancarlo Mauri, Mauro Castelli, NOVA Information Management School (NOVA IMS), NOVA IMS Research and Development Center (MagIC), Information Management Research Center (MagIC) - NOVA Information Management School, Castelli, M, Dondi, R, Mauri, G, and Zoppis, I
- Subjects
String algorithms ,General Computer Science ,Computational complexity theory ,Parameterized complexity ,0102 computer and information sciences ,02 engineering and technology ,Fixed-parameter algorithms ,01 natural sciences ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI ,String algorithm ,Theoretical Computer Science ,Approximation algorithms ,Computational complexity ,Longest common subsequence ,Combinatorics ,Longest common subsequence problem ,Subsequence ,0202 electrical engineering, electronic engineering, information engineering ,Time complexity ,Mathematics ,Multiset ,Sequence ,Settore INF/01 - Informatica ,Approximation algorithm ,INF/01 - INFORMATICA ,010201 computation theory & mathematics ,Fixed-parameter algorithm ,020201 artificial intelligence & image processing ,Computer Science(all) - Abstract
Castelli, M., Dondi, R., Mauri, G., & Zoppis, I. (2019). Comparing incomplete sequences via longest common subsequence. Theoretical Computer Science, 796, 272-285. https://doi.org/10.1016/j.tcs.2019.09.022 Inspired by scaffold filling, a recent approach for genome reconstruction from incomplete data, we consider a variant of the well-known longest common subsequence problem for the comparison of two sequences. The new problem, called Longest Filled Common Subsequence, aims to compare a complete sequence with an incomplete one, i.e. with some missing elements. Longest Filled Common Subsequence (LFCS), given a complete sequence A, an incomplete sequence B, and a multiset M of symbols missing in B, asks for a sequence B⁎ obtained by inserting the symbols of M into B so that B⁎ induces a common subsequence with A of maximum length. We investigate the computational and approximation complexity of the problem and we show that it is NP-hard and APX-hard when A contains at most two occurrences of each symbol, and we give a polynomial time algorithm when the input sequences are over a constant-size alphabet. We give a [Formula presented] approximation algorithm for the Longest Filled Common Subsequence problem. Finally, we present a fixed-parameter algorithm for the problem, when it is parameterized by the number of symbols inserted in B that “match” symbols of A. authorsversion published
- Published
- 2019
26. Orthology Correction for Gene Tree Reconstruction: Theoretical and Experimental Results
- Author
-
Giancarlo Mauri, Italo Zoppis, Riccardo Dondi, Dondi, R, Mauri, G, and Zoppis, I
- Subjects
0301 basic medicine ,Theoretical computer science ,Relation (database) ,Computer science ,0206 medical engineering ,Relation graph ,02 engineering and technology ,Approximation Algorithms ,CoGraph Editing ,Gene Tree Reconstruction ,Genetic Algorithms ,Orthologous Genes ,Computer Science ,03 medical and health sciences ,Genetic algorithm ,Cograph ,Orthologous Genes, Gene Tree Reconstruction, CoGraph Editing, Approximation Algorithms, Genetic Algorithms ,General Environmental Science ,Settore INF/01 - Informatica ,Degree (graph theory) ,INF/01 - INFORMATICA ,Approximation algorithm ,Quantitative Biology::Genomics ,Tree (graph theory) ,Graph ,030104 developmental biology ,Bounded function ,General Earth and Planetary Sciences ,020602 bioinformatics - Abstract
We consider how the orthology/paralogy information can be corrected in order to represent a gene tree, a problem that has recently gained interest in phylogenomics. Interestingly, the problem is related to the Minimum CoGraph Editing problem on the relation graph that represents orthology/paralogy information, where we want to minimize the number of edit operations on the given relation graph in order to obtain a cograph. In this paper we provide both theoretical and experimental results on the Minimum CoGraph Editing problem. On the theoretical side, we provide approximation algorithms for bounded degree relation graphs, for the general problem and for the problem restricted to deletion of edges. On the experimental side, we present a genetic algorithm for Minimum CoGraph Editing and we provide an experimental evaluation of the genetic algorithm on synthetic data.
- Published
- 2017
27. Computational Methods for Resting-State EEG of Patients With Disorders of Consciousness
- Author
-
Angela Morreale, Urszula Markowska-Kacznar, Riccardo Dondi, Italo Zoppis, Giancarlo Mauri, Francesca Gasparini, Silvia Corchs, Sara Manzoni, Giovanni Chioma, Corchs, S, Chioma, G, Dondi, R, Gasparini, F, Manzoni, S, Markowska-Kacznar, U, Mauri, G, Zoppis, I, and Morreale, A
- Subjects
Decision support system ,medicine.medical_specialty ,VS ,Resting state analysis ,Mini Review ,Disorders of consciousness ,Computational methods ,Deep learning ,DOC ,EEG ,Machine learning ,MCS ,Electroencephalography ,050105 experimental psychology ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI ,lcsh:RC321-571 ,03 medical and health sciences ,computational methods ,0302 clinical medicine ,Medicine ,0501 psychology and cognitive sciences ,Intensive care medicine ,lcsh:Neurosciences. Biological psychiatry. Neuropsychiatry ,Coma ,machine learning ,resting state analysis ,deep learning ,Settore INF/01 - Informatica ,medicine.diagnostic_test ,business.industry ,General Neuroscience ,05 social sciences ,Minimally conscious state ,INF/01 - INFORMATICA ,Correction ,Resting state analysi ,medicine.disease ,Linear discriminant analysis ,Computational method ,humanities ,Resting state eeg ,Differential diagnosis ,medicine.symptom ,business ,030217 neurology & neurosurgery ,Neuroscience - Abstract
Patients who survive brain injuries may develop Disorders of Consciousness (DOC) such as Coma, Vegetative State (VS) or Minimally Conscious State (MCS). Unfortunately, the rate of misdiagnosis between VS and MCS due to clinical judgment is high. Therefore, diagnostic decision support systems aiming to correct any differentiation between VS and MCS are essential for the characterization of an adequate treatment and an effective prognosis. In recent decades, there has been a growing interest in the new EEG computational techniques. We have reviewed how resting-state EEG is computationally analyzed to support differential diagnosis between VS and MCS in view of applicability of these methods in clinical practice. The studies available so far have used different techniques and analyses; it is therefore hard to draw general conclusions. Studies using a discriminant analysis with a combination of various factors and reporting a cut-off are among the most interesting ones for a future clinical application.
- Published
- 2019
28. On the tractability of finding disjoint clubs in a network
- Author
-
Italo Zoppis, Riccardo Dondi, Giancarlo Mauri, Dondi, R, Mauri, G, and Zoppis, I
- Subjects
General Computer Science ,Existential quantification ,APX-hardne ,Parameterized complexity ,0102 computer and information sciences ,02 engineering and technology ,Disjoint sets ,Fixed-parameter algorithms ,01 natural sciences ,Theoretical Computer Science ,Combinatorics ,APX-hardness ,Graph problems ,0202 electrical engineering, electronic engineering, information engineering ,Graph algorithms ,Connectivity ,Mathematics ,Settore INF/01 - Informatica ,Graph algorithm ,INF/01 - INFORMATICA ,s-Clubs ,Graph ,010201 computation theory & mathematics ,Fixed-parameter algorithm ,Bounded function ,020201 artificial intelligence & image processing ,Graph problem - Abstract
We study a variant of the problem of finding a collection of disjoint s-clubs in a given network. Given a graph, the problem asks whether there exists a collection of at most r disjoint s-clubs that covers at least k vertices of the network. An s-club is a connected graph that has diameter bounded by s, for a positive integer s. We demand that each club is non-trivial, that is it has order at least t ≥ 2 , for some positive integer t. We prove that the problem is APX-hard even when the input graph has bounded degree, s = 2 , t = 3 and r = | V | . Moreover, we show that the problem is polynomial-time solvable when s ≥ 4 , t = 3 and r = | V | , and when s ≥ 3 , t = 2 and r = | V | . Finally, for s ≥ 2 , we present a fixed-parameter algorithm for the problem, when parameterized by the number of covered vertices.
- Published
- 2019
29. Kernel Methods: Support Vector Machines
- Author
-
Giancarlo Mauri, Riccardo Dondi, Italo Zoppis, Ranganathan, S, Gribskov, M, Nakai, K, Schönbach, C (Editors in Chief), Zoppis, I, Mauri, G, and Dondi, R
- Subjects
Computer Science::Machine Learning ,Support vector machine ,Computer science ,Machine learning ,computer.software_genre ,Clustering ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI ,Separable space ,Soft margin ,Classification ,Kernel methods ,Maximun margin hyperplane ,Regression ,Statistics::Machine Learning ,Margin (machine learning) ,Feature (machine learning) ,Cluster analysis ,Settore INF/01 - Informatica ,business.industry ,INF/01 - INFORMATICA ,Nonlinear system ,ComputingMethodologies_PATTERNRECOGNITION ,Kernel method ,Hyperplane ,Computer Science::Sound ,Artificial intelligence ,business ,computer - Abstract
Support Vector Machines (SVMs) and Kernel methods have found a natural and effective coexistence since their introduction in the early 90s. In this article, we will describe the main concepts that motivate the importance of this relationship. In fact SVMs use kernels for learning linear predictors in high dimensional feature spaces. First, we will describe intuitively how this mechanism is realized, introducing the main concepts and definitions, i.e., maximum margin hyperplane, kernels and non-linearly separable problems. Then the main mathematical issues for linear and nonlinear SVM-based classification will be detailed. We will also introduce some important extensions of the SVMs ideas, by considering the Soft Margin Classification, SVM multi-class classification, SVM clustering, and SVM regression.
- Published
- 2019
30. Covering a graph with clubs
- Author
-
Riccardo Dondi, Giancarlo Mauri, Italo Zoppis, Florian Sikora, Dondi, R, Mauri, G, Sikora, F, Zoppis, I, Università degli Studi di Bergamo, Università degli studi di Bergamo (UniBG), Università degli Studi di Milano-Bicocca [Milano] (UNIMIB), Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision (LAMSADE), Université Paris Dauphine-PSL, and Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
graphs ,Settore INF/01 - Informatica ,General Computer Science ,Existential quantification ,Approximation algorithm ,INF/01 - INFORMATICA ,0102 computer and information sciences ,02 engineering and technology ,01 natural sciences ,Computer Science Applications ,Theoretical Computer Science ,Combinatorics ,Computational Theory and Mathematics ,010201 computation theory & mathematics ,Network mining ,0202 electrical engineering, electronic engineering, information engineering ,Graph (abstract data type) ,Clubs, graphn covering ,[INFO]Computer Science [cs] ,020201 artificial intelligence & image processing ,Cover (algebra) ,Geometry and Topology ,Clique cover ,Mathematics - Abstract
International audience; Finding cohesive subgraphs in a network has been investigated in many network mining applications. Several alternative formulations of cohesive subgraph have been proposed, a notable one of them is s-club, which is a subgraph whose diameter is at most s. Here we consider a natural variant of the well-known Minimum Clique Cover problem, where we aim to cover a given graph with the minimum number of s-clubs, instead of cliques. We study the computational and approximation complexity of this problem, when s is equal to 2 or 3. We show that deciding if there exists a cover of a graph with three 2-clubs is NP-complete, and that deciding if there exists a cover of a graph with two 3-clubs is NP-complete. Then, we consider the approximation complexity of covering a graph with the minimum number of 2-clubs and 3-clubs. We show that, given a graph G=(V,E) to be covered, covering G with the minimum number of 2-clubs is not approximable within factor O(|V|1/2−ε), for any ε>0, and covering G with the minimum number of 3-clubs is not approximable within factor O(|V|1−ε), for any ε>0. On the positive side, we give an approximation algorithm of factor 2|V|1/2log3/2|V| for covering a graph with the minimum number of 2-clubs.
- Published
- 2019
31. Kernel Machines: Introduction
- Author
-
Riccardo Dondi, Giancarlo Mauri, Italo Zoppis, Ranganathan, S, Gribskov, M, Nakai, K, Schönbach, C (Editors in Chief), Zoppis, I, Mauri, G, and Dondi, R
- Subjects
Inner product representation ,Kernel algorithms ,Kernel function ,Property (philosophy) ,Theoretical computer science ,Settore INF/01 - Informatica ,Notice ,Computer science ,INF/01 - INFORMATICA ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI ,Kernel algorithm ,Kernel (image processing) ,Similarity (psychology) ,Representation (mathematics) - Abstract
Kernels are a type of similarity measures between observed patterns. By exploiting an important mathematical property, they provide new pattern representation and, at the same time, new perspectives to solve many machine learning problems. In this article, we will describe and motivate the main idea of the kernel approach. Notice that, while the usage of kernels is based on well founded theoretical arguments, we will confine our discussion to the main intuitive functional concepts and definitions.
- Published
- 2019
32. Optimized Social Explanation for Educational Platforms
- Author
-
Italo Zoppis, Luca Marconi, Riccardo Dondi, Giancarlo Mauri, Sara Manzoni, Francesco Epifania, Lane H.,Zvacek S.,Uhomoibhi J., Epifania, F, Marconi, L, Mauri, G, Manzoni, S, Dondi, R, and Zoppis, I
- Subjects
Optimized Social Explanation ,Knowledge management ,Settore INF/01 - Informatica ,Computer science ,business.industry ,Genetic Algorithms ,Social Networks ,Communities Identification ,WhoTeach ,Recommender Systems ,INF/01 - INFORMATICA ,Work in process ,Recommender system ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI ,Test (assessment) ,Knowledge extraction ,Order (business) ,business - Abstract
Recommender Systems have became extremely appealing for all technology enhanced learning researches aimed to design, develop and test technical innovations which support and enhance learning and teaching practices of both individuals and organizations. In this scenario a new emerging paradigm of explainable Recommander Systems leverages social friend information to provide (social) explanations in order to supply users with his/her friends’ public interests as explained recommendation. In this paper we introduce our educational platform called “WhoTeach”, an innovative and original system to integrate knowledge discovery, social networks analysis, and educational services. In particular, we report here our work in progress for providing “WhoTeach” environment with optimized Social Explainable Recommandations oriented to design new teachers’ programmes and courses.
- Published
- 2019
33. Graph Algorithms
- Author
-
Riccardo Dondi, Giancarlo Mauri, Italo Zoppis, Ranganathan, S, Gribskov, M, Nakai, K, Schönbach, C (Editors in Chief), Dondi, R, Mauri, G, and Zoppis, I
- Subjects
Travelling salesman problem ,Graph algorithms, shortest path, maximum flow, minimum spanning tree, travelling salesman problem ,Shortest path ,Settore INF/01 - Informatica ,Maximum flow ,INF/01 - INFORMATICA ,Graph algorithms ,Minimum spanning tree ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI - Abstract
In this contribution, we describe the main graph algorithms that are applied in several fields, from transport network to computational biology. First, we will review two-well known traversal algorithms (depth-first search and breadth-first search). Then we consider the problem of computing a shortest path between two vertices and we review the Dijkstra's algorithm, the Bellman-Ford algorithm and Floyd-Warshall algorithm. Another fundamental graph problem we will deal with is that of computing a maximum flow in graph and we will review the Ford-Fulkerson algorithm for the computation of a maximum flow. Next, we will consider consider the computation of a minimum spanning tree of a given graph, and we will review two well-know algorithms for solving this problem: Kruskal's algorithm and Prim's algorithm. Finally, we will consider the traveling salesman problem and the nearest neighbour algorithm, which is a heuristic for this problem.
- Published
- 2019
34. Top k 2-clubs in a network: A genetic algorithm
- Author
-
Riccardo Dondi, Italo Zoppis, Giancarlo Mauri, Mauro Castelli, Sara Manzoni, Castelli, M, Dondi, R, Manzoni, S, Mauri, G, and Zoppis, I
- Subjects
Random graph ,Theoretical computer science ,Settore INF/01 - Informatica ,Heuristic (computer science) ,Computer science ,2-club maximization ,Community optimization ,Genetic algorithms ,Graphs ,INF/01 - INFORMATICA ,Graph ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI ,Set (abstract data type) ,Identification (information) ,Genetic algorithm ,Graph (abstract data type) ,Biological network - Abstract
The identification of cohesive communities (dense sub-graphs) is a typical task applied to the analysis of social and biological networks. Different definitions of communities have been adopted for particular occurrences. One of these, the 2-club (dense subgraphs with diameter value at most of length 2) has been revealed of interest for applications and theoretical studies. Unfortunately, the identification of 2-clubs is a computationally intractable problem, and the search of approximate solutions (at a reasonable time) is therefore fundamental in many practical areas. In this article, we present a genetic algorithm based heuristic to compute a collection of Top k 2-clubs, i.e., a set composed by the largest k 2-clubs which cover an input graph. In particular, we discuss some preliminary results for synthetic data obtained by sampling Erdös-Rényi random graphs.
- Published
- 2019
35. Correcting Gene Trees by Leaf Insertions: Complexity and Approximation
- Author
-
Stefano Beretta, Riccardo Dondi, Beretta, S, and Dondi, R
- Subjects
0301 basic medicine ,K-ary tree ,Computational Complexity ,General Computer Science ,0206 medical engineering ,Gene Tree-Species Tree Reconciliation ,02 engineering and technology ,Algorithms ,Computational biology ,Gene tree corrections ,Phylogenomics ,Theoretical Computer Science ,Computer Science ,Combinatorics ,03 medical and health sciences ,Phylogenomic ,Mathematics ,Multiset ,Settore INF/01 - Informatica ,Computer Science (all) ,Gene tree correction ,Approximation algorithm ,Search tree ,Algorithm ,Tree (data structure) ,Tree traversal ,ComputingMethodologies_PATTERNRECOGNITION ,030104 developmental biology ,Tree rearrangement ,020602 bioinformatics ,Order statistic tree ,Computer Science(all) - Abstract
Gene tree correction has recently gained interest in phylogenomics, as it gives insights in understanding the evolution of gene families. Following some recent approaches based on leaf edit operations, we consider a variant of the problem where a gene tree is corrected by inserting leaves with labels in a multiset M. We show that the problem of deciding whether a gene tree can be corrected by inserting leaves with labels in M is NP-complete. Then, we consider an optimization variant of the problem that asks for the correction of a gene tree with leaves labeled by a multiset M ′ , with M ′ ⊇ M , having minimum size. For this optimization variant of the problem, we present a factor 2 approximation algorithm.
- Published
- 2016
- Full Text
- View/download PDF
36. Covering with clubs: Complexity and approximability
- Author
-
Giancarlo Mauri, Florian Sikora, Italo Zoppis, Riccardo Dondi, Università degli Studi di Bergamo, Università degli studi di Bergamo (UniBG), Università degli Studi di Milano-Bicocca [Milano] (UNIMIB), Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision (LAMSADE), Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS), Iliopoulus, C, Leong, HW, Sung WK, Dondi, R, Mauri, G, Sikora, F, and Zoppis, I
- Subjects
Vertex (graph theory) ,graph algorithms ,FOS: Computer and information sciences ,0211 other engineering and technologies ,0102 computer and information sciences ,02 engineering and technology ,01 natural sciences ,Theoretical Computer Science ,Combinatorics ,Computer Science - Data Structures and Algorithms ,[INFO]Computer Science [cs] ,Data Structures and Algorithms (cs.DS) ,Graph algorithms ,approximation algorithms ,Mathematics ,021103 operations research ,Settore INF/01 - Informatica ,INF/01 - INFORMATICA ,Approximation algorithm ,Graph theory ,approximation complexity, cohesive subgraphs, graph covering ,s-Clubs ,010201 computation theory & mathematics ,Computer Science ,Algorithms ,Combinatorial optimization ,combinatorial optimization - Abstract
Finding cohesive subgraphs in a network is a well-known problem in graph theory. Several alternative formulations of cohesive subgraph have been proposed, a notable example being $s$-club, which is a subgraph where each vertex is at distance at most $s$ to the others. Here we consider the problem of covering a given graph with the minimum number of $s$-clubs. We study the computational and approximation complexity of this problem, when $s$ is equal to 2 or 3. First, we show that deciding if there exists a cover of a graph with three $2$-clubs is NP-complete, and that deciding if there exists a cover of a graph with two $3$-clubs is NP-complete. Then, we consider the approximation complexity of covering a graph with the minimum number of $2$-clubs and $3$-clubs. We show that, given a graph $G=(V,E)$ to be covered, covering $G$ with the minimum number of $2$-clubs is not approximable within factor $O(|V|^{1/2 -\varepsilon})$, for any $\varepsilon>0$, and covering $G$ with the minimum number of $3$-clubs is not approximable within factor $O(|V|^{1 -\varepsilon})$, for any $\varepsilon>0$. On the positive side, we give an approximation algorithm of factor $2|V|^{1/2}\log^{3/2} |V|$ for covering a graph with the minimum number of $2$-clubs., Accepted in IWOCA 2018
- Published
- 2018
37. Distributed Heuristics for Optimizing Cohesive Groups: A Support for Clinical Patient Engagement in Social Network Analysis
- Author
-
Giancarlo Mauri, Alessandro Beltramo, Davide Coppetti, Riccardo Dondi, Italo Zoppis, Zoppis, I, Dondi, R, Coppetti, D, Beltramo, A, and Mauri, G
- Subjects
Optimization problem ,Knowledge management ,020205 medical informatics ,Computer science ,Computer Networks and Communications ,0102 computer and information sciences ,02 engineering and technology ,Distributed Heuristic ,01 natural sciences ,Dense Communities ,Argument ,Health care ,Genetic algorithm ,0202 electrical engineering, electronic engineering, information engineering ,Decision-making ,Social network analysis ,2-club optimization problem ,Genetic Algorithm ,Hardware and Architecture ,Settore INF/01 - Informatica ,business.industry ,INF/01 - INFORMATICA ,Social relation ,Computer Networks and Communication ,010201 computation theory & mathematics ,Dense Communitie ,business ,Heuristics - Abstract
Social interaction allows to support the disease management by creating online spaces where patients can interact with clinicians, and share experiences with other patients. Therefore, promoting targeted communication in online social spaces is a means to group patients around shared goals, offer emotional support, and finally engage patients in their healthcare decision making process. In this paper, we approach the argument from a theoretical perspective: we design an optimization problem aimed to encourage the creation of (induced) sub-networks of patients which, being recently diagnosed, wish to deepen the knowledge about their medical treatment with some other similar profiled patients, which have already been followed up by specific (even alternative) care centers. In particular, due to the computational hardness of the proposed problem, we provide approximated solutions based on distributed heuristics (i.e., Genetic Algorithms). Results are given for simulated data using Erdos-Renyi random graphs.
- Published
- 2018
38. 8th Workshop on Biomedical and Bioinformatics Challenges for Computer Science – BBC2015
- Author
-
Mauro Castelli, Stefano Beretta, Mario Cannataro, Beretta, S, Cannataro, M, and Dondi, R
- Subjects
Computer science ,Computer Science (all) ,0202 electrical engineering, electronic engineering, information engineering ,General Earth and Planetary Sciences ,020206 networking & telecommunications ,020201 artificial intelligence & image processing ,02 engineering and technology ,Bioinformatics ,General Environmental Science - Abstract
Agapito, G., Cannataro, M., Castelli, M., Dondi, R., & Zoppis, I. (2017). 10th Workshop on Biomedical and Bioinformatics Challenges for Computer Science - BBC2017. Procedia Computer Science, 108, 1113-1114. https://doi.org/10.1016/j.procs.2017.05.279
- Published
- 2015
- Full Text
- View/download PDF
39. HapCol: Accurate and Memory-efficient Haplotype Assembly from Long Reads
- Author
-
Riccardo Dondi, Yuri Pirola, Simone Zaccaria, Gunnar W. Klau, Nadia Pisanti, Paola Bonizzoni, Evolutionary Intelligence, Econometrics and Operations Research, Bioinformatics, Integrative Bioinformatics, AIMMS, Dipartimento di Informatica Sistemistica e Comunicazione (DISCo), Università degli Studi di Milano-Bicocca [Milano] (UNIMIB), Dipartimento di Scienze dei Linguaggi, della Comunicazione e degli Studi Culturali, Università degli studi di Bergamo (UniBG), Life Sciences [Amsterdam] (MAC4), Centrum Wiskunde & Informatica (CWI), Equipe de recherche européenne en algorithmique et biologie formelle et expérimentale (ERABLE), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Dipartimento di Informatica [Pisa] (DI), University of Pisa - Università di Pisa, Università degli Studi di Milano-Bicocca = University of Milano-Bicocca (UNIMIB), Università degli Studi di Bergamo = University of Bergamo (UniBg), Pirola, Y, Zaccaria, S, Dondi, R, Klau, G, Pisanti, N, and Bonizzoni, P
- Subjects
0301 basic medicine ,Statistics and Probability ,Computer science ,0206 medical engineering ,02 engineering and technology ,computer.software_genre ,Polymorphism, Single Nucleotide ,Biochemistry ,03 medical and health sciences ,Computational Theory and Mathematic ,Molecular Biology ,ComputingMilieux_MISCELLANEOUS ,Settore INF/01 - Informatica ,Medicine (all) ,Haplotype ,Computer Science Applications1707 Computer Vision and Pattern Recognition ,Sequence Analysis, DNA ,Computational Theory and Mathematics ,Computational Mathematics ,Diploidy ,Computer Science Applications ,030104 developmental biology ,Exact algorithm ,Haplotypes ,Computational Mathematic ,Data mining ,Ploidy ,Computational problem ,[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM] ,computer ,020602 bioinformatics ,Algorithms ,Software - Abstract
Motivation: Haplotype assembly is the computational problem of reconstructing haplotypes in diploid organisms and is of fundamental importance for characterizing the effects of single-nucleotide polymorphisms on the expression of phenotypic traits. Haplotype assembly highly benefits from the advent of ‘future-generation’ sequencing technologies and their capability to produce long reads at increasing coverage. Existing methods are not able to deal with such data in a fully satisfactory way, either because accuracy or performances degrade as read length and sequencing coverage increase or because they are based on restrictive assumptions. Results: By exploiting a feature of future-generation technologies—the uniform distribution of sequencing errors—we designed an exact algorithm, called HapCol, that is exponential in the maximum number of corrections for each single-nucleotide polymorphism position and that minimizes the overall error-correction score. We performed an experimental analysis, comparing HapCol with the current state-of-the-art combinatorial methods both on real and simulated data. On a standard benchmark of real data, we show that HapCol is competitive with state-of-the-art methods, improving the accuracy and the number of phased positions. Furthermore, experiments on realistically simulated datasets revealed that HapCol requires significantly less computing resources, especially memory. Thanks to its computational efficiency, HapCol can overcome the limits of previous approaches, allowing to phase datasets with higher coverage and without the traditional all-heterozygous assumption. Availability and implementation: Our source code is available under the terms of the GNU General Public License at http://hapcol.algolab.eu/. Contact: bonizzoni@disco.unimib.it Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2016
40. Clique editing to support case versus control discrimination
- Author
-
Riccardo Dondi, Italo Zoppis, Giancarlo Mauri, Jain, LC, Howlett, RJ, Czarnowski, I, Caballero, AM, Dondi, R, Mauri, G, and Zoppis, I
- Subjects
Vertex (graph theory) ,Loop (graph theory) ,Mathematical optimization ,graph partitioning ,Computer science ,Vertex cover ,0102 computer and information sciences ,Clique (graph theory) ,Clique graph ,01 natural sciences ,Edge cover ,Simplex graph ,Combinatorics ,010104 statistics & probability ,Dominating set ,Perfect graph ,genetic algorithm ,Partition (number theory) ,Split graph ,0101 mathematics ,Block graph ,Clique ,Decision Sciences (all) ,Computer Science (all) ,Settore INF/01 - Informatica ,INF/01 - INFORMATICA ,Feedback arc set ,Graph ,Vertex (geometry) ,Treewidth ,Graph bandwidth ,010201 computation theory & mathematics ,Independent set ,Triangle-free graph ,Feedback vertex set ,Maximal independent set ,Fractional coloring ,Moral graph ,MathematicsofComputing_DISCRETEMATHEMATICS - Abstract
We present a graph-based approach to support case vs control discrimination problems. The goal is to partition a given input graph in two sets, a clique and an independent set, such that there is no edge connecting a vertex of the clique with a vertex of the independent set. Following a parsimonious principle, we consider the problem that aims to modify the input graph into a most similar output graph that consists of a clique and an independent set (with no edge between the two sets). First, we present a theoretical result showing that the problem admits a polynomial-time approximation scheme. Then, motivated by the complexity of such an algorithm, we propose a genetic algorithm and we present an experimental analysis on simulated data.
- Published
- 2016
41. On the Minimum Error Correction Problem for Haplotype Assembly in Diploid and Polyploid Genomes
- Author
-
Riccardo Dondi, Paola Bonizzoni, Gunnar W. Klau, Simone Zaccaria, Nadia Pisanti, Yuri Pirola, Dipartimento di Informatica Sistemistica e Comunicazione (DISCo), Università degli Studi di Milano-Bicocca = University of Milano-Bicocca (UNIMIB), Dipartimento di Scienze dei Linguaggi, della Comunicazione e degli Studi Culturali, Università degli Studi di Bergamo = University of Bergamo (UniBg), Equipe de recherche européenne en algorithmique et biologie formelle et expérimentale (ERABLE), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Life Sciences [Amsterdam] (MAC4), Centrum Wiskunde & Informatica (CWI), Dipartimento di Informatica [Pisa] (DI), University of Pisa - Università di Pisa, Econometrics and Operations Research, Bioinformatics, Integrative Bioinformatics, AIMMS, Bonizzoni, P, Dondi, R, Klau, G, Pirola, Y, Pisanti, N, Zaccaria, S, Università degli Studi di Milano-Bicocca [Milano] (UNIMIB), and Università degli studi di Bergamo (UniBG)
- Subjects
0301 basic medicine ,haplotype ,haplotypes ,Computational complexity theory ,graph theory ,[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS] ,Parameterized complexity ,0102 computer and information sciences ,Biology ,Polymorphism, Single Nucleotide ,01 natural sciences ,Polyploidy ,03 medical and health sciences ,Fragment (logic) ,Genetic ,Computational Theory and Mathematic ,Genetics ,Humans ,Cluster analysis ,Molecular Biology ,combinatorial optimization, graph theory, haplotypes, next-generation sequencing ,Settore INF/01 - Informatica ,Models, Genetic ,Genome, Human ,combinatorial optimization ,next-generation sequencing ,Modeling and Simulation ,Computational Theory and Mathematics ,Computational Mathematics ,Computational mathematics ,Graph theory ,Sequence Analysis, DNA ,Diploidy ,030104 developmental biology ,010201 computation theory & mathematics ,Combinatorial optimization ,Computational problem ,[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM] ,Algorithm ,Algorithms - Abstract
In diploid genomes, haplotype assembly is the computational problem of reconstructing the two parental copies, called haplotypes, of each chromosome starting from sequencing reads, called fragments, possibly affected by sequencing errors. Minimum error correction (MEC) is a prominent computational problem for haplotype assembly and, given a set of fragments, aims at reconstructing the two haplotypes by applying the minimum number of base corrections. MEC is computationally hard to solve, but some approximation-based or fixed-parameter approaches have been proved capable of obtaining accurate results on real data. In this work, we expand the current characterization of the computational complexity of MEC from the approximation and the fixed-parameter tractability point of view. In particular, we show that MEC is not approximable within a constant factor, whereas it is approximable within a logarithmic factor in the size of the input. Furthermore, we answer open questions on the fixed-parameter tractability for parameters of classical or practical interest: the total number of corrections and the fragment length. In addition, we present a direct 2-approximation algorithm for a variant of the problem that has also been applied in the framework of clustering data. Finally, since polyploid genomes, such as those of plants and fishes, are composed of more than two copies of the chromosomes, we introduce a novel formulation of MEC, namely the k-ploid MEC problem, that extends the traditional problem to deal with polyploid genomes. We show that the novel formulation is still both computationally hard and hard to approximate. Nonetheless, from the parameterized point of view, we prove that the problem is tractable for parameters of practical interest such as the number of haplotypes and the coverage, or the number of haplotypes and the fragment length.
- Published
- 2016
42. On the Approximation of Correlation Clustering and Consensus Clustering
- Author
-
Riccardo Dondi, Tao Jiang, Paola Bonizzoni, Gianluca Della Vedova, Bonizzoni, P, DELLA VEDOVA, G, Dondi, R, and Jiang, T
- Subjects
Fuzzy clustering ,Computer Networks and Communications ,Single-linkage clustering ,Correlation clustering ,0102 computer and information sciences ,02 engineering and technology ,01 natural sciences ,Complete-linkage clustering ,Theoretical Computer Science ,APX-hardness ,Combinatorics ,CURE data clustering algorithm ,Consensus clustering ,0202 electrical engineering, electronic engineering, information engineering ,Cluster analysis ,Approximation ,k-medians clustering ,Mathematics ,consensus clustering ,Applied Mathematics ,INF/01 - INFORMATICA ,correlation clustering ,Computational Theory and Mathematics ,010201 computation theory & mathematics ,ptas ,020201 artificial intelligence & image processing - Abstract
The Correlation Clustering problem has been introduced recently [N. Bansal, A. Blum, S. Chawla, Correlation Clustering, in: Proc. 43rd Symp. Foundations of Computer Science, FOCS, 2002, pp. 238-247] as a model for clustering data when a binary relationship between data points is known. More precisely, for each pair of points we have two scores measuring the similarity and dissimilarity respectively, of the two points, and we would like to compute an optimal partition where the value of a partition is obtained by summing up the similarity scores of pairs involving points from the same cluster and the dissimilarity scores of pairs involving points from different clusters. A closely related problem is Consensus Clustering, where we are given a set of partitions and we would like to obtain a partition that best summarizes the input partitions. The latter problem is a restricted case of Correlation Clustering. In this paper we prove that Minimum Consensus Clustering is APX-hard even for three input partitions, answering an open question in the literature, while Maximum Consensus Clustering admits a PTAS. We exhibit a combinatorial and practical frac(4, 5)-approximation algorithm based on a greedy technique for Maximum Consensus Clustering on three partitions. Moreover, we prove that a PTAS exists for Maximum Correlation Clustering when the maximum ratio between two scores is at most a constant. © 2007 Elsevier Inc. All rights reserved.
- Published
- 2008
43. Fixed-parameter algorithms for scaffold filling
- Author
-
Anna Paola Carrieri, Riccardo Dondi, Laurent Bulteau, Department of Software Engineering and Theoretical Computer Science (SWT - TUB), Technische Universität Berlin (TU), Dipartimento di Informatica Sistemistica e Comunicazione (DISCo), Università degli Studi di Milano-Bicocca [Milano] (UNIMIB), Dipartimento di Scienze dei Linguaggi, della Comunicazione e degli Studi Culturali, Università degli studi di Bergamo (UniBG), Bulteau, L, Carrieri, A, and Dondi, R
- Subjects
Scaffold ,General Computer Science ,Computer science ,Open problem ,0206 medical engineering ,Sequencing data ,[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS] ,Parameterized complexity ,0102 computer and information sciences ,02 engineering and technology ,Fixed-parameter algorithms ,01 natural sciences ,Genome ,Theoretical Computer Science ,Computational biology ,Gene ,Filling Problem ,Scaffold filling ,Genome comparison ,Settore INF/01 - Informatica ,Computer Science (all) ,010201 computation theory & mathematics ,[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM] ,Algorithm ,020602 bioinformatics - Abstract
International audience; The new sequencing technologies, called next-generation se-quencing, provide a huge amount of data that can be used to reconstruct genomes. However, the methods applied to reconstruct genomes often are not able to reconstruct a complete genome and provide only an incomplete information. Here we consider two combinatorial problems that aim to reconstruct complete genomes by inserting a collection of missing genes. The first problem we consider, called One-sided scaffold filling, given an incomplete genome B and a complete genome A, asks for the insertion of missing genes into an incomplete genome B with the goal of maximizing the common adjacencies between genomes B (resulting from the insertion of missing genes in B) and A. The second problem, called Two-sided scaffold filling, given two incomplete genomes A, B, asks for the insertion of missing genes into both genomes so that the resulting genomes A and B have the same multiset of genes and the number of common adjacencies between A and B is maximized. Both problems were proved to be NP-hard, while their parameterized complexity, when the parameter is the number of common adjacencies of the resulting genomes, was left as an open problem. In this paper, we settle this open problem by presenting fixed-parameter algorithms for One-sided scaffold filling and Two-sided scaffold filling.
- Published
- 2015
44. On the fixed parameter tractability and approximability of the minimum error correction problem
- Author
-
Bonizzoni, Paola, Dondi, Riccardo, Klau, Gunnar, Pirola, Yuri, Pisanti, Nadia, Zaccaria, Simone, Cicalese, F., Porat, E., Vaccaro, U., Evolutionary Intelligence, Dipartimento di Informatica Sistemistica e Comunicazione (DISCo), Università degli Studi di Milano-Bicocca = University of Milano-Bicocca (UNIMIB), Dipartimento di Scienze dei Linguaggi, della Comunicazione e degli Studi Culturali, Università degli Studi di Bergamo = University of Bergamo (UniBg), Equipe de recherche européenne en algorithmique et biologie formelle et expérimentale (ERABLE), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Life Sciences [Amsterdam] (MAC4), Centrum Wiskunde & Informatica (CWI), Dipartimento di Informatica [Pisa] (DI), University of Pisa - Università di Pisa, Econometrics and Operations Research, Bioinformatics, AIMMS, Amsterdam Business Research Institute, Integrative Bioinformatics, Università degli Studi di Milano-Bicocca [Milano] (UNIMIB), Università degli studi di Bergamo (UniBG), Bonizzoni, P, Dondi, R, Klau, G, Pirola, Y, Pisanti, N, and Zaccaria, S
- Subjects
0303 health sciences ,algorithm ,computational complexity ,Settore INF/01 - Informatica ,0206 medical engineering ,Haplotype ,[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS] ,INF/01 - INFORMATICA ,haplotype assembly ,02 engineering and technology ,Base (topology) ,Set (abstract data type) ,Combinatorics ,03 medical and health sciences ,Chromosome (genetic algorithm) ,fixed-parameter tractability ,[INFO]Computer Science [cs] ,Minimum error correction ,Computational problem ,Algorithm ,020602 bioinformatics ,030304 developmental biology ,Mathematics - Abstract
International audience; Haplotype assembly is the computational problem of reconstructing the two parental copies, called haplotypes, of each chromosome starting from sequencing reads, called fragments, possibly affected by sequencing errors. Minimum Error Correction (MEC) is a prominent computational problem for haplotype assembly and, given a set of fragments , aims at reconstructing the two haplotypes by applying the minimum number of base corrections. By using novel combinatorial properties of MEC instances, we are able to provide new results on the fixed-parameter tractability and approx-imability of MEC. In particular, we show that MEC is in FPT when para-meterized by the number of corrections, and, on " gapless " instances, it is in FPT also when parameterized by the length of the fragments, whereas the result known in literature forces the reconstruction of complementary haplotypes. Then, we show that MEC cannot be approximated within any constant factor while it is approximable within factor O(log nm) where nm is the size of the input. Finally, we provide a practical 2-approximation algorithm for the Binary MEC, a variant of MEC that has been applied in the framework of clustering binary data.
- Published
- 2015
45. Covering pairs in directed acyclic graphs
- Author
-
Yuri Pirola, Stefano Beretta, Riccardo Dondi, Paola Bonizzoni, Niko Beerenwinkel, Dediu, A-H, Martín-Vide, C, Sierra-Rodríguez, J-L, Truthe, B, Beerenwinkel, N, Beretta, S, Bonizzoni, P, Dondi, R, and Pirola, Y
- Subjects
FOS: Computer and information sciences ,paired-end read ,Minimum path cover ,Sequence reconstruction ,Paired-end reads ,Computational complexity ,Computational complexity theory ,General Computer Science ,constraint ,0102 computer and information sciences ,Computational Complexity (cs.CC) ,01 natural sciences ,Combinatorics ,Set (abstract data type) ,03 medical and health sciences ,Cardinality ,Computer Science - Data Structures and Algorithms ,path cover ,Data Structures and Algorithms (cs.DS) ,oftware testing ,Time complexity ,Mathematics ,030304 developmental biology ,0303 health sciences ,bioinformatic ,Degree (graph theory) ,Settore INF/01 - Informatica ,DAG ,INF/01 - INFORMATICA ,Path cover ,sequence reconstruction ,paired-end reads ,computational complexity ,Directed acyclic graph ,Computer Science - Computational Complexity ,010201 computation theory & mathematics ,Path (graph theory) ,MathematicsofComputing_DISCRETEMATHEMATICS - Abstract
The Computer Journal, 58 (7), ISSN:0010-4620, ISSN:1460-2067
- Published
- 2015
- Full Text
- View/download PDF
46. Restricted and Swap Common Superstring: A Multivariate Algorithmic Perspective
- Author
-
Giancarlo Mauri, Riccardo Dondi, Paola Bonizzoni, Italo Zoppis, Bonizzoni, P, Dondi, R, Mauri, G, and Zoppis, I
- Subjects
Discrete mathematics ,String comparison ,Multiset ,Settore INF/01 - Informatica ,General Computer Science ,Applied Mathematics ,APX-hardne ,Superstring theory ,Parameterized complexity ,INF/01 - INFORMATICA ,Algorithms ,APX-hardness ,Shortest common superstring ,Substring ,Computer Science Applications ,Combinatorics ,Algorithm ,High Energy Physics::Theory ,Polynomial kernel ,Bounded function ,Theory of computation ,Swap (computer programming) ,Mathematics - Abstract
In several areas, for example in bioinformatics and in AI planning, the Shortest Common Superstring problem (SCS) and variants thereof have been successfully applied for string comparison. In this paper we consider two variants of SCS recently introduced, namely Restricted Common Superstring (RCS) and Swapped Common Superstring (SRCS). In RCS we are given a set $$S$$S of strings and a multiset $$\mathcal {M}$$M of symbols, and we look for an ordering $$\mathcal {M}_o$$Mo of $$\mathcal {M}$$M such that the number of input strings which are substrings of $$\mathcal {M}_o$$Mo is maximized. In SRCS we are given a set $$S$$S of strings and a text $$\mathcal {T}$$T, and we look for a swap ordering $$\mathcal {T}_o$$To of $$\mathcal {T}$$T (an ordering of $$\mathcal {T}$$T obtained by swapping only some pairs of adjacent symbols) such that the number of input strings which are substrings of $$\mathcal {T}_o$$To is maximized. In this paper we propose a multivariate algorithmic analysis of the complexity of the two problems, aiming at determining how different parameters influence the complexity of the two problems. We consider as interesting parameters the size of the solutions (that is the number of input strings contained in the computed superstring), the maximum length of the given input strings, the size of the alphabet over which the input strings range. First, we give two fixed-parameter algorithms, where the parameter is the size of the solution, for SRCS and lRCS (the RCS problem restricted to strings of length bounded by a parameter $$\ell $$l). Furthermore, we complement these results by showing that SRCS and lRCS do not admit a polynomial kernel unless $$NP \subseteq coNP/Poly$$NP⊆coNP/Poly. Then, we show that SRCS is APX-hard even when the input strings have length bounded by a constant (equal to $$10$$10) or are over a binary alphabet.
- Published
- 2015
47. A clustering algorithm for planning the integration process of a large number of conceptual schemas
- Author
-
Paola Bonizzoni, Francesco Salandra, Yuri Pirola, Marco Comerio, Riccardo Dondi, Carlo Batini, Batini, C, Bonizzoni, P, Comerio, M, Dondi, R, Pirola, Y, and Salandra, F
- Subjects
Conceptual schema ,schema integration ,Settore INF/01 - Informatica ,Computer science ,Correlation clustering ,INF/01 - INFORMATICA ,computer.software_genre ,ING-INF/05 - SISTEMI DI ELABORAZIONE DELLE INFORMAZIONI ,Computer Science Applications ,Theoretical Computer Science ,Computational Theory and Mathematics ,Hardware and Architecture ,Schema (psychology) ,Theory of computation ,Data mining ,Cluster analysis ,Settore ING-INF/05 - Sistemi di Elaborazione delle Informazioni ,computer ,Computer Science::Databases ,Software ,clustering - Abstract
When tens and even hundreds of schemas are involved in the integration process, criteria are needed for choosing clusters of schemas to be integrated, so as to deal with the integration problem through an efficient iterative process. Schemas in clusters should be chosen according to cohesion and coupling criteria that are based on similarities and dissimilarities among schemas. In this paper, we propose an algorithm for a novel variant of the correlation clustering approach that addresses the problem of assisting a designer in integrating a large number of conceptual schemas. The novel variant introduces upper and lower bounds to the number of schemas in each cluster, in order to avoid too complex and too simple integration contexts respectively. We give a heuristic for solving the problem, being an NP hard combinatorial problem. An experimental activity demonstrates an appreciable increment in the effectiveness of the schema integration process when clusters are computed by means of the proposed algorithm w.r.t. the ones manually defined by an expert.
- Published
- 2015
48. Robust conclusions in mass spectrometry analysis
- Author
-
Clizia Chinello, Giancarlo Mauri, Riccardo Dondi, Erica Gianazza, Fulvio Magni, Massimiliano Borsani, Italo Zoppis, Zoppis, I, Dondi, R, Borsani, M, Gianazza, E, Chinello, C, Magni, F, and Mauri, G
- Subjects
Computer science ,Data analysis ,Inference ,Machine learning ,computer.software_genre ,Mass spectrometry ,Graph ,Robustness (computer science) ,Graph property ,General Environmental Science ,Random graph ,Biological data ,Settore INF/01 - Informatica ,business.industry ,Computer Science (all) ,Robust decision ,Data analysi ,Robust decisions ,General Earth and Planetary Sciences ,Data mining ,Artificial intelligence ,business ,computer - Abstract
A central issue in biological data analysis is that uncertainty, resulting from different factors of variability, may change the effect of the events being investigated. Therefore, robustness is a fundamental step to be considered. Robustness refers to the ability of a process to cope well with uncertainties, but the different ways to model both the processes and the uncertainties lead to many alternative conclusions in the robustness analysis. In this paper we apply a framework allowing to deal with such questions for mass spectrometry data. Specifically, we provide robust decisions when testing hypothesis over a case/control population of subject measurements (i.e. proteomic profiles). To this concern, we formulate (i) a reference model for the observed data (i.e., graphs), (ii) a reference method to provide decisions (i.e., test of hypotheses over graph properties) and (iii) a reference model of variability to employ sources of uncertainties (i.e., random graphs). We apply these models to a realcase study, analyzing the mass spectrometry profiles of the most common type of Renal Cell Carcinoma; the Clear Cell variant.
- Published
- 2015
49. Correcting gene tree by removal and modification: Tractability and approximability
- Author
-
Stefano Beretta, Riccardo Dondi, Mauro Castelli, Beretta, S, Castelli, M, and Dondi, R
- Subjects
Discrete mathematics ,K-ary tree ,Approximation complexity ,Settore INF/01 - Informatica ,Gene tree ,Gene tree reconciliation ,Computational biology ,Gene tree correction ,Parameterized complexity ,Theoretical Computer Science ,Combinatorics ,Tree (data structure) ,Computational Theory and Mathematics ,Computational Theory and Mathematic ,Discrete Mathematics and Combinatorics ,Order (group theory) ,Gomory–Hu tree ,Focus (optics) ,Constant (mathematics) ,Discrete Mathematics and Combinatoric ,Mathematics - Abstract
Gene tree correction with respect to a given species tree is a problem that has been recently proposed in order to better understand the evolution of gene families. One of the combinatorial methods proposed to tackle with this problem aims to correct a gene tree by removing the minimum number of leaves/labels (Minimum Leaf Removal and Minimum Label Removal, respectively). The two problems have been shown to be APX-hard, and fixed-parameter tractable, when parameterized by the number of leaves/labels removed. In this paper, we focus on the approximation complexity of these two problems and we show that they are not approximable within factor b log ? m , where m is the number of leaves of the species tree and b 0 is a constant. Furthermore, we introduce and study two new variants of the problem, where the goal is the correction of a gene tree with the minimum number of leaf/label modifications (Minimum Leaf Modification and Minimum Label Modification, respectively). We show that the two modification problems, differently from the removal versions, are unlikely to be fixed-parameter tractable. More precisely, we prove that the Minimum Leaf Modification problem is W 1 ] -hard, when parameterized by the number of leaf modifications, and that the Minimum Label Modification problem is W 2 ] -hard, when parameterized by the number of label modifications.
- Published
- 2015
50. Reconciling a gene tree to a species tree under the duplication cost model
- Author
-
Paola Bonizzoni, Gianluca Della Vedova, Riccardo Dondi, Bonizzoni, P, DELLA VEDOVA, G, and Dondi, R
- Subjects
General Computer Science ,Phylogenetic tree ,Weight-balanced tree ,INF/01 - INFORMATICA ,Tree (graph theory) ,Theoretical Computer Science ,Computational biology ,Combinatorics ,Exact algorithm ,Phylogenomics ,Tree rearrangement ,Gene duplication ,NP-hardness ,Evolutionary trees ,Gene family ,gene trees, gene duplications ,Mathematics ,Computer Science(all) - Abstract
The general problem of reconciling the information from evolutionary trees representing the relationships between distinct gene families is of great importance in bioinformatics and has been popularized among the computer science researchers by Ma et al. [From gene trees to species trees, SIAM J. Comput. 30(3) (2000) 729–752] where the authors pose the intriguing question if a certain definition of minimum tree that reconciles a gene tree and a species tree is correct. We answer affirmatively to this question; moreover, we show an efficient algorithm for computing such minimum-leaf reconciliation trees and prove the uniqueness of such trees. We then tackle some different versions of the biological problem by showing that the exemplar problem, arising from the exemplar analysis of multigene genomes, is NP-hard even when the number of copies of a given label is at most two. Finally, we introduce two novel formulations for the problem of recombining evolutionary trees, extending the gene duplication problem studied in [Ma et al., From gene trees to species trees, SIAM J. Comput. 30(3) (2000) 729–752; M. Fellows et al., On the multiple gene duplication problem, in: Proc. Ninth Internat. Symp. on Algorithms and Computation (ISAAC98), 1998; R. Page, Maps between trees and cladistic analysis of historical associations among genes, Systematic Biology 43 (1994) 58–77; R.M. Page, J. Cotton, Vertebrate phylogenomics: reconciled trees and gene duplications, in: Proc. Pacific Symp. on Biocomputing 2002 (PSB2002), 2002, pp. 536–547; R. Guigò et al., Reconstruction of ancient molecular phylogeny, Mol. Phy. and Evol. 6(2) (1996) 189–213], and we give an exact algorithm (via dynamic programming) for one of these formulations.
- Published
- 2005
- Full Text
- View/download PDF
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.