Author: "Stephan Steigele" / Journal: algorithms for molecular biology - Searchworks@Jio Institute Digital Library Search Results

1. On the maximal cliques in c-max-tolerance graphs and their application in clustering molecular sequences

Author: Stephan Steigele, Kay Nieselt, Michael Kaufmann, and Katharina A. Lehmann
Subjects: Biological data, Clique-sum, lcsh:QH426-470, Applied Mathematics, Research, Graph partition, Binary logarithm, Combinatorics, lcsh:Genetics, lcsh:Biology (General), Computational Theory and Mathematics, Structural Biology, Partition (number theory), K-tree, Cluster analysis, lcsh:QH301-705.5, Molecular Biology, Time complexity, Algorithm, Mathematics, MathematicsofComputing_DISCRETEMATHEMATICS
Abstract: Given a set S of n locally aligned sequences, it is often necessary to partition it into groups of very similar sequences to facilitate subsequent computations, such as the generation of a phylogenetic tree. This article introduces a new clustering method that partitions S into subsets such that the overlap of each pair of sequences within a subset is at least a given percentage c of the lengths of the two sequences. We show that this problem can be reduced to finding all maximal cliques in a special kind of max-tolerance graph, which we call a c-max-tolerance graph. Previously we have shown that all maximal cliques in general max-tolerance graphs can be found in O(n^3 + out) time. Here, using a new kind of sweep-line algorithm, we show that the restriction to c-max-tolerance graphs yields a better runtime of O(n^2 log n + out). Furthermore, we present another algorithm that is much easier to implement and, though theoretically slower than the first one, still runs in polynomial time. We then experimentally analyze the number and structure of all maximal cliques in a c-max-tolerance graph, depending on the chosen c-value. We apply our simple algorithm to artificial and biological data and show that this implementation is much faster than the well-known application Cliquer. Finally, by introducing a new heuristic that uses the set of all maximal cliques to partition S, we show that the computed partition gives a reasonable clustering for biological data sets.
Published: 2006
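
Illustration: the reduction described in the abstract can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the article's method: sequences are modelled as half-open intervals (start, end), two intervals are joined by an edge when their overlap is at least c times the length of the longer one (so at least a fraction c of both lengths), and maximal cliques are enumerated with a generic Bron-Kerbosch search with pivoting instead of the sweep-line algorithm or the simpler algorithm the abstract mentions. The names overlap, build_graph, and maximal_cliques are hypothetical.

```python
from itertools import combinations


def overlap(a, b):
    """Length of the overlap of two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def build_graph(intervals, c):
    """Adjacency sets of the graph assumed here: i and j are adjacent when
    their overlap is at least c times the length of the longer interval."""
    adj = {i: set() for i in range(len(intervals))}
    for i, j in combinations(range(len(intervals)), 2):
        a, b = intervals[i], intervals[j]
        longest = max(a[1] - a[0], b[1] - b[0])
        if overlap(a, b) >= c * longest:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques with Bron-Kerbosch and pivoting.
    This is a generic stand-in, not the algorithm from the article."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)


# Made-up example data: three mutually shifted intervals and one far away.
intervals = [(0, 100), (10, 110), (20, 120), (200, 300)]
loose = maximal_cliques(build_graph(intervals, 0.75))
strict = maximal_cliques(build_graph(intervals, 0.85))
print(loose)
print(strict)
```

With the looser threshold the three shifted intervals form a single clique; with the stricter one the first and third interval no longer overlap enough, so the group splits into two overlapping cliques. Turning such overlapping cliques into a partition is the job of the heuristic the abstract refers to, which is not reproduced here.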
