Author: "Esko Ukkonen" / Language: english - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Esko Ukkonen"' showing total 31 results

1. Accurate self-correction of errors in long reads using de Bruijn graphs

Author: Leena Salmela, Riku Walve, Eric Rivals, Esko Ukkonen, Department of Computer Science, Helsinki Institute for Information Technology, Finnish Centre of Excellence in Algorithmic Data Analysis Research (Algodan), Combinatorial Pattern Matching research group / Esko Ukkonen, Genome-scale Algorithmics research group / Veli Mäkinen, Bioinformatics, Algorithmic Bioinformatics, Helsinki Institute for Information Technology (HIIT), Helsingin yliopisto = Helsingfors universitet = University of Helsinki-Aalto University, Méthodes et Algorithmes pour la Bioinformatique (MAB), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), Institut de Biologie Computationnelle (IBC), Institut National de la Recherche Agronomique (INRA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), Academy of Finland (grant 267591)EU FP7 SYSCOL UE7-SYSCOL-258236, ANR-12-BS02-0008,Colib'read,Méthodes d'extraction d'information biologique dans les données HTS non assemblées(2012), ANR-11-BINF-0002,IBC,Institut de biologie Computationnelle(2011), European Project: 258236,EC:FP7:HEALTH,FP7-HEALTH-2010-two-stage,SYSCOL(2011), Aalto University, ANR-11-BINF-0002,IBC,Institut de Biologie Computationnelle de Montpellier(2011), Université de Montpellier (UM)-Institut National de la Recherche Agronomique (INRA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS), Aalto University-University of Helsinki, and Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)
Subjects: 0301 basic medicine, Statistics and Probability, assembly, Computer science, 0206 medical engineering, ACM: F.: Theory of Computation/F.2: ANALYSIS OF ALGORITHMS AND PROBLEM COMPLEXITY, education, Sequence assembly, Word error rate, non hybrid correction, Recomb-Seq/Recomb-Cbb 2016, 02 engineering and technology, Saccharomyces cerevisiae, Biochemistry, Set (abstract data type), 03 medical and health sciences, substitution, Escherichia coli, Quantitative Biology - Genomics, Molecular Biology, Self correction, Throughput (business), De Bruijn sequence, Genomics (q-bio.GN), PacBio, Genome, ACM: F.: Theory of Computation/F.2: ANALYSIS OF ALGORITHMS AND PROBLEM COMPLEXITY/F.2.2: Nonnumerical Algorithms and Problems/F.2.2.6: Sorting and searching, Sequence analysis, 1184 Genetics, developmental biology, physiology, High-Throughput Nucleotide Sequencing, Sequence Analysis, DNA, DNA, 113 Computer and information sciences, Computer Science Applications, Computational Mathematics, 030104 developmental biology, Computational Theory and Mathematics, de Bruijn, indel, FOS: Biological sciences, NGS, LoRDEC, Nanopore sequencing, [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], Error detection and correction, Algorithm, 020602 bioinformatics, Algorithms, Software
Abstract: New long read sequencing technologies, like PacBio SMRT and Oxford NanoPore, can produce sequencing reads up to 50,000 bp long but with an error rate of at least 15%. Reducing the error rate is necessary for subsequent utilisation of the reads in, e.g., de novo genome assembly. The error correction problem has been tackled either by aligning the long reads against each other or by a hybrid approach that uses the more accurate short reads produced by second generation sequencing technologies to correct the long reads. We present an error correction method that uses long reads only. The method consists of two phases: first we use an iterative alignment-free correction method based on de Bruijn graphs with increasing length of k-mers, and second, the corrected reads are further polished using long-distance dependencies that are found using multiple alignments. According to our experiments the proposed method is the most accurate one relying on long reads only for read sets with high coverage. Furthermore, when the coverage of the read set is at least 75x, the throughput of the new method is at least 20% higher. LoRMA is freely available at http://www.cs.helsinki.fi/u/lmsalmel/LoRMA/. Paper accepted at RECOMB-Seq 2016.
Published: 2017
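The first phase described in the abstract above builds de Bruijn graphs from k-mers of the reads. As a rough illustration only (not the LoRMA implementation), the sketch below builds such a graph from the k-mers that occur often enough to be trusted. The function name `build_de_bruijn` and the abundance cutoff `min_count` are assumptions introduced here, not details taken from the record.

```python
from collections import Counter, defaultdict


def build_de_bruijn(reads, k, min_count=2):
    """Build a de Bruijn graph from frequent k-mers (illustrative sketch).

    Nodes are (k-1)-mers; each k-mer seen at least `min_count` times adds
    an edge from its prefix to its suffix. `min_count` is a hypothetical
    abundance threshold chosen for this example.
    """
    # Count every k-mer over all reads.
    counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    graph = defaultdict(set)
    for kmer, count in counts.items():
        if count >= min_count:
            graph[kmer[:-1]].add(kmer[1:])
    # Return plain, deterministically ordered adjacency lists.
    return {node: sorted(succ) for node, succ in graph.items()}


reads = ["ACGTA", "ACGTT", "ACGTA"]
graph = build_de_bruijn(reads, k=3, min_count=2)
```

In this toy input the k-mer GTT occurs only once and is dropped, so the graph keeps the single path AC, CG, GT, TA. An iterative scheme like the one the abstract mentions would repeat the construction with larger values of `k`.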

2. Longest common substrings with k mismatches

Author: Emanuele Giaquinta, Tomas Flouri, Kassian Kobert, Esko Ukkonen, Department of Computer Science, Aalto-yliopisto, Aalto University, Combinatorial Pattern Matching research group / Esko Ukkonen, and Bioinformatics
Subjects: FOS: Computer and information sciences, Discrete mathematics, String algorithms, Hamming distance, Space (mathematics), 113 Computer and information sciences, Longest repeated substring problem, Substring, Combinatorial problems, Computer Science Applications, Theoretical Computer Science, Longest common substring problem, Longest common substring, Combinatorics, Computer Science - Data Structures and Algorithms, Signal Processing, Data Structures and Algorithms (cs.DS), Constant (mathematics), Information Systems, Mathematics
Abstract: The longest common substring with $k$-mismatches problem is to find, given two strings $S_1$ and $S_2$, a longest substring $A_1$ of $S_1$ and $A_2$ of $S_2$ such that the Hamming distance between $A_1$ and $A_2$ is $\le k$. We introduce a practical $O(nm)$ time and $O(1)$ space solution for this problem, where $n$ and $m$ are the lengths of $S_1$ and $S_2$, respectively. This algorithm can also be used to compute the matching statistics with $k$-mismatches of $S_1$ and $S_2$ in $O(nm)$ time and $O(m)$ space. Moreover, we also present a theoretical solution for the $k = 1$ case which runs in $O(n \log m)$ time, assuming $m\le n$, and uses $O(m)$ space, improving over the existing $O(nm)$ time and $O(m)$ space bound of Babenko and Starikovskaya. Accepted version.
Published: 2015
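One way to reach the $O(nm)$ time and $O(1)$ extra space bound quoted above is to slide a two-pointer window along every diagonal of the implicit comparison matrix. The sketch below is an illustration written for this listing under that assumption; the function name `lcs_k_mismatches` is hypothetical and the code is not claimed to be the authors' algorithm.

```python
def lcs_k_mismatches(s1, s2, k):
    """Return (length, start1, start2) of a longest pair of equal-length
    substrings of s1 and s2 whose Hamming distance is at most k.

    Each diagonal of the (implicit) comparison matrix is scanned with a
    sliding window, so only a constant number of counters is kept.
    """
    n, m = len(s1), len(s2)
    best = (0, 0, 0)
    for d in range(-(m - 1), n):
        i0, j0 = max(d, 0), max(-d, 0)     # first cell of this diagonal
        length = min(n - i0, m - j0)       # number of cells on it
        left = 0                           # window start (offset on diagonal)
        mismatches = 0
        for right in range(length):
            if s1[i0 + right] != s2[j0 + right]:
                mismatches += 1
            # Shrink the window until it holds at most k mismatches again.
            while mismatches > k:
                if s1[i0 + left] != s2[j0 + left]:
                    mismatches -= 1
                left += 1
            if right - left + 1 > best[0]:
                best = (right - left + 1, i0 + left, j0 + left)
    return best
```

For example, `lcs_k_mismatches("abcde", "xbcdy", 0)` reports the exact common substring `bcd`, while allowing one mismatch extends the answer to length 4. Each position on a diagonal is visited at most twice (once by each pointer), which gives the quadratic total.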

3. Mining the VVV: star formation and embedded clusters

Author: Lauri K. Haikala, Esko Ukkonen, and Otto Solin
Subjects: Physics, Star formation, Gaussian, FOS: Physical sciences, Astronomy and Astrophysics, Astrophysics, Astrophysics::Cosmology and Extragalactic Astrophysics, Galactic plane, Mixture model, Astrophysics - Astrophysics of Galaxies, Background noise, symbols.namesake, Space and Planetary Science, Bulge, Astrophysics of Galaxies (astro-ph.GA), Expectation–maximization algorithm, symbols, Cluster (physics), Astrophysics::Solar and Stellar Astrophysics, Astrophysics::Earth and Planetary Astrophysics, Astrophysics::Galaxy Astrophysics
Abstract: The aim of this study is to locate previously unknown stellar clusters from the VISTA Variables in the Vía Láctea Survey (VVV) catalogue data. The method, fitting a mixture model of Gaussian densities and background noise using the expectation maximization algorithm to pre-filtered NIR survey stellar catalogue data, was developed by the authors for the UKIDSS Galactic Plane Survey (GPS). The search located 88 previously unknown mainly embedded stellar cluster candidates and 39 previously unknown sites of star formation in the 562 deg² covered by VVV in the Galactic bulge and the southern disk.
Published: 2013
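The method named in the abstract above fits a mixture of Gaussian densities plus background noise with the expectation maximization algorithm. The following is a deliberately simplified one-dimensional toy, assuming a single Gaussian component and a uniform background on a known interval. The function name, the initialisation, and the small floor on the standard deviation are all assumptions made for this illustration and do not come from the paper.

```python
import math


def em_gaussian_plus_background(xs, lo, hi, iters=100):
    """Fit one 1-D Gaussian plus a uniform background on [lo, hi] by EM.

    Returns (mu, sigma, weight), where `weight` is the estimated fraction
    of points that belong to the Gaussian component.
    """
    n = len(xs)
    mu = sum(xs) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / n)
    weight = 0.5
    background = 1.0 / (hi - lo)           # uniform background density
    for _ in range(iters):
        # E-step: probability that each point belongs to the Gaussian.
        resp = []
        for x in xs:
            g = weight * math.exp(-0.5 * ((x - mu) / sigma) ** 2)
            g /= sigma * math.sqrt(2.0 * math.pi)
            b = (1.0 - weight) * background
            resp.append(g / (g + b))
        # M-step: re-estimate the parameters from the weighted points.
        total = sum(resp)
        mu = sum(r * x for r, x in zip(resp, xs)) / total
        var = sum(r * (x - mu) ** 2 for r, x in zip(resp, xs)) / total
        sigma = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
        weight = total / n
    return mu, sigma, weight


cluster = [4.8, 4.9, 5.0, 5.1, 5.2, 5.0, 4.95, 5.05]
noise = [0.5, 2.0, 8.0, 9.5]
mu, sigma, weight = em_gaussian_plus_background(cluster + noise, 0.0, 10.0)
```

On this toy data the Gaussian settles on the tight group around 5 and the scattered points are absorbed by the background term. A catalogue search would work on sky positions in two dimensions and with several components.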

4. Algorithmic Learning Theory : 22nd International Conference, ALT 2011, Espoo, Finland, October 5-7, 2011, Proceedings

Author: Jyrki Kivinen, Csaba Szepesvári, Esko Ukkonen, and Thomas Zeugmann
Subjects: Artificial intelligence, Machine theory, Algorithms, Computer science, Application software
Abstract: This book constitutes the refereed proceedings of the 22nd International Conference on Algorithmic Learning Theory, ALT 2011, held in Espoo, Finland, in October 2011, co-located with the 14th International Conference on Discovery Science, DS 2011. The 28 revised full papers presented together with the abstracts of 5 invited talks were carefully reviewed and selected from numerous submissions. The papers are divided into topical sections of papers on inductive inference, regression, bandit problems, online learning, kernel and margin-based methods, intelligent agents and other learning models.
Published: 2011

5. Efficient Algorithms for the Discovery of Gapped Factors

Author: Cinzia Pizzi, Esko Ukkonen, and Alberto Apostolico
Subjects: lcsh:QH426-470, Computer science, Interface (Java), Suffix tree, Monotonic function, 0102 computer and information sciences, 01 natural sciences, law.invention, Business process discovery, 03 medical and health sciences, Structural Biology, law, Enumeration, Molecular Biology, lcsh:QH301-705.5, 030304 developmental biology, 0303 health sciences, Sequence, Research, Applied Mathematics, Arbitrarily large, lcsh:Genetics, Computational Theory and Mathematics, lcsh:Biology (General), 010201 computation theory & mathematics, Algorithm, Word (computer architecture)
Abstract: Background: The discovery of surprisingly frequent patterns is of paramount interest in bioinformatics and computational biology. Among the patterns considered, those consisting of pairs of solid words that co-occur within a prescribed maximum distance, or gapped factors, emerge in a variety of contexts of DNA and protein sequence analysis. A few algorithms and tools have been developed in connection with specific formulations of the problem; however, none can handle comprehensively each of the multiple ways in which the distance between the two terms in a pair may be defined. Results: This paper presents efficient algorithms and tools for the extraction of all pairs of words up to an arbitrarily large length that co-occur surprisingly often in close proximity within a sequence. Whereas the number of such pairs in a sequence of $n$ characters can be $\Theta(n^4)$, it is shown that an exhaustive discovery process can be carried out in $O(n^2)$ or $O(n^3)$, depending on the way distance is measured. This is made possible by a prudent combination of properties of pattern maximality and monotonicity of scores, which together reduce the number of word pairs that must be weighed explicitly while still producing the scores attained by any of the pairs not explicitly considered. We applied our approach to the discovery of spaced dyads in DNA sequences. Conclusions: Experiments on biological datasets prove that the method is effective and much faster than exhaustive enumeration of candidate patterns. Software is freely available to academic users via the web interface at http://bcb.dei.unipd.it:8080/dyweb.
Published: 2011

6. Finding significant matches of position weight matrices in linear time

Author: Pasi Rastas, Cinzia Pizzi, and Esko Ukkonen
Subjects: Matching (graph theory), 02 engineering and technology, String searching algorithm, Pattern Recognition, Automated, 03 medical and health sciences, Matrix (mathematics), Sequence Analysis, Protein, 0202 electrical engineering, electronic engineering, information engineering, Genetics, Humans, Pattern matching, Time complexity, Position-Specific Scoring Matrices, 030304 developmental biology, Mathematics, 0303 health sciences, Sequence database, Applied Mathematics, Computational Biology, Proteins, DNA, Sequence Analysis, DNA, 020201 artificial intelligence & image processing, Algorithm design, Algorithm, Algorithms, Biotechnology
Abstract: Position weight matrices are an important method for modeling signals or motifs in biological sequences, both in DNA and protein contexts. In this paper, we present fast algorithms for the problem of finding significant matches of such matrices. Our algorithms are of the online type, and they generalize classical multipattern matching, filtering, and superalphabet techniques of combinatorial string matching to the problem of weight matrix matching. Several variants of the algorithms are developed, including multiple matrix extensions that perform the search for several matrices in one scan through the sequence database. Experimental performance evaluation is provided to compare the new techniques against each other as well as against some other online and index-based algorithms proposed in the literature. Compared to the brute-force O(mn) approach, our solutions can be faster by a factor that is proportional to the matrix length m. Our multiple-matrix filtration algorithm had the best performance in the experiments. On a current PC, this algorithm finds significant matches (p = 0.0001) of the 123 JASPAR matrices in the human genome in about 18 minutes.
Published: 2011
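For orientation, the brute-force $O(mn)$ baseline that the abstract above compares against can be sketched in a few lines. The matrix representation (a list of per-position score dictionaries) and the function name `pwm_matches` are assumptions made for this example; the faster online algorithms of the paper are not reproduced here.

```python
def pwm_matches(seq, pwm, threshold):
    """Naive position weight matrix scan.

    `pwm` is a list of dicts, one per matrix column, mapping a symbol to
    its score. Every window of length m = len(pwm) is scored in full, so
    the running time is O(m * n) for a sequence of length n.
    """
    m = len(pwm)
    hits = []
    for i in range(len(seq) - m + 1):
        score = sum(pwm[j][seq[i + j]] for j in range(m))
        if score >= threshold:
            hits.append((i, score))
    return hits


# Toy two-column matrix that prefers "AC"; the scores are made up.
toy_pwm = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
]
print(pwm_matches("TACAC", toy_pwm, threshold=4))
```

In practice the threshold would be derived from a significance level such as the p-value mentioned in the abstract; here it is simply given as a number.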

7. Fast scaffolding with small independent mixed integer programs

Author: Veli Mäkinen, Leena Salmela, Johannes Ylinen, Niko Välimäki, Esko Ukkonen, Department of Computer Science, Helsinki Institute for Information Technology, Genome-scale Algorithmics research group / Veli Mäkinen, and Bioinformatics
Subjects: Statistics and Probability, Scaffold, Theoretical computer science, Source code, media_common.quotation_subject, 0206 medical engineering, education, Pseudomonas syringae, Sequence assembly, 02 engineering and technology, Biology, Biochemistry, 03 medical and health sciences, Software, Escherichia coli, Animals, Caenorhabditis elegans, Molecular Biology, Integer programming, Simulation, 030304 developmental biology, media_common, 0303 health sciences, Genome, business.industry, High-Throughput Nucleotide Sequencing, Sequence Analysis, DNA, 113 Computer and information sciences, Original Papers, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, Graph (abstract data type), State (computer science), business, Sequence Analysis, Algorithms, 020602 bioinformatics, Integer (computer science)
Abstract: Motivation: Assembling genomes from short read data has become increasingly popular, but the problem remains computationally challenging especially for larger genomes. We study the scaffolding phase of sequence assembly where preassembled contigs are ordered based on mate pair data. Results: We present MIP Scaffolder that divides the scaffolding problem into smaller subproblems and solves these with mixed integer programming. The scaffolding problem can be represented as a graph and the biconnected components of this graph can be solved independently. We present a technique for restricting the size of these subproblems so that they can be solved accurately with mixed integer programming. We compare MIP Scaffolder to two state of the art methods, SOPRA and SSPACE. MIP Scaffolder is fast and produces better or as good scaffolds as its competitors on large genomes. Availability: The source code of MIP Scaffolder is freely available at http://www.cs.helsinki.fi/u/lmsalmel/mip-scaffolder/. Contact: leena.salmela@cs.helsinki.fi
Published: 2011

8. Integrating sequence, evolution and functional genomics in regulatory genomics

Author: Olivier Sand, Esko Ukkonen, Thomas Manke, Richard M.R. Coulson, Kimmo Palin, Jacques van Helden, Alvis Brazma, Martin Vingron, Laboratoire de Bioinformatique des Génomes et des Réseaux (BiGRe), Université libre de Bruxelles (ULB), and Spinelli, Lionel
Subjects: Clinomics, [SDV]Life Sciences [q-bio], Genomics, Computational biology, Review, Biology, Genome, Structural genomics, Evolution, Molecular, 03 medical and health sciences, Regulatory Elements, Transcriptional, ComputingMilieux_MISCELLANEOUS, 030304 developmental biology, [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM], Comparative genomics, 0303 health sciences, Base Sequence, 030302 biochemistry & molecular biology, Computational genomics, Computational Biology, [SDV] Life Sciences [q-bio], [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], Databases, Nucleic Acid, Functional genomics, Protein Structure Initiative
Abstract: Finding transcription factor binding sites in regulatory regions of the genome. With genome analysis expanding from the study of genes to the study of gene regulation, 'regulatory genomics' utilizes sequence information, evolution and functional genomics measurements to unravel how regulatory information is encoded in the genome.
Published: 2009

9. On the complexity of finding gapped motifs

Author: Morris Michael, Esko Ukkonen, and François Nicolas
Subjects: String matching with don't care symbols, FOS: Computer and information sciences, Theoretical computer science, Discrete Mathematics (cs.DM), Physics::Instrumentation and Detectors, 0102 computer and information sciences, 02 engineering and technology, Computational Complexity (cs.CC), 01 natural sciences, Computer Science::Digital Libraries, Theoretical Computer Science, 0202 electrical engineering, electronic engineering, information engineering, Discrete Mathematics and Combinatorics, Motif discovery, NP-complete, Mathematics, Computer Science::Cryptography and Security, Gapped pattern, Computer Science::Software Engineering, Tandem motifs, Computer Science - Computational Complexity, Computational Theory and Mathematics, 010201 computation theory & mathematics, Computer Science::Programming Languages, 020201 artificial intelligence & image processing, Computer Science - Discrete Mathematics
Abstract: This paper has been withdrawn by the corresponding author because the newest version is now published in the Journal of Discrete Algorithms.
Published: 2008

10. Fast Search Algorithms for Position Specific Scoring Matrices

Author: Pasi Rastas, Cinzia Pizzi, and Esko Ukkonen
Subjects: Incremental heuristic search, Matrix (mathematics), Search algorithm, business.industry, Computer science, Scoring algorithm, Filtration (mathematics), Pattern recognition, String searching algorithm, Artificial intelligence, business, Position-Specific Scoring Matrices
Abstract: Fast search algorithms for finding good instances of patterns given as position specific scoring matrices are developed, and some empirical results on their performance on DNA sequences are reported. The algorithms basically generalize the Aho-Corasick, filtration, and superalphabet techniques of string matching to the scoring matrix search. As compared to the naive search, our algorithms can be faster by a factor which is proportional to the length of the pattern. In our experimental comparison of different algorithms the new algorithms were clearly faster than the naive method and also faster than the well-known lookahead scoring algorithm. The Aho-Corasick technique is the fastest for short patterns and high significance thresholds of the search. For longer patterns the filtration method is better while the superalphabet technique is the best for very long patterns and low significance levels. We also observed that the actual speed of all these algorithms is very sensitive to implementation details.
Published: 2007

11. Equivalence of metabolite fragments and flow analysis of isotopomer distributions for flux estimation

Author: Esa Pitkänen, Esko Ukkonen, Hannu Maaheimo, Juho Rousu, and Ari Rantanen
Subjects: 0106 biological sciences, 0303 health sciences, Metabolite, Metabolic network, 01 natural sciences, Isotopomers, 03 medical and health sciences, chemistry.chemical_compound, chemistry, 010608 biotechnology, Isotopomer distribution, TRACER, Statistics, Biological system, Equivalence (measure theory), 030304 developmental biology, Mathematics
Abstract: The most accurate estimates of the activity of metabolic pathways are obtained by conducting isotopomer tracer experiments. The success of this method, however, is intimately dependent on the quality and amount of data on isotopomer distributions of intermediate metabolites. In this paper we present a novel method for discovering sets of metabolite fragments that always have identical isotopomer distributions, regardless of the velocities of the reactions in the metabolic network. We outline several applications of this equivalence concept, including improved propagation of measurements, experiment planning and consistency checking of metabolic networks. Our computational experiments in measurement propagation indicate that the improvement via the use of this technique may be substantial.
Published: 2006

12. Planning optimal measurements of isotopomer distributions for estimation of metabolic fluxes

Author: Taneli Mielikäinen, Ari Rantanen, Esko Ukkonen, Hannu Maaheimo, and Juho Rousu
Subjects: 0106 biological sciences, Optimization problem, Magnetic Resonance Spectroscopy, Computational complexity theory, Computer science, Metabolite, Metabolic network, Bioinformatics, computer.software_genre, Central carbon metabolism, 01 natural sciences, Biochemistry, Mass Spectrometry, Isotopomers, chemistry.chemical_compound, metabolic flux analysis, Metabolic flux analysis, Protein Interaction Mapping, metabolites, 0303 health sciences, Nuclear magnetic resonance spectroscopy, metabolomics, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, Data mining, Signal transduction, Algorithms, Signal Transduction, Statistics and Probability, Saccharomyces cerevisiae Proteins, Saccharomyces cerevisiae, isotopomer distribution, Mass spectrometry, Models, Biological, 03 medical and health sciences, Metabolomics, 010608 biotechnology, Computer Simulation, Molecular Biology, 030304 developmental biology, Estimation, Measure (data warehouse), metabolic profiling, Carbon, Metabolic pathway, chemistry, Informatics, computer, Diagnostic Techniques, Radioisotope
Abstract: Motivation: Flux estimation using isotopomer information of metabolites is currently the most reliable method to obtain quantitative estimates of the activity of metabolic pathways. However, the development of isotopomer measurement techniques for intermediate metabolites is a demanding task. Careful planning of isotopomer measurements is thus needed to maximize the available flux information while minimizing the experimental effort. Results: In this paper we study the question of finding the smallest subset of metabolites to measure that ensures the same level of isotopomer information as the measurement of every metabolite in the metabolic network. We study the computational complexity of this optimization problem in the case of the so-called positional enrichment data, give methods for obtaining exact and fast approximate solutions, and evaluate empirically the efficacy of the proposed methods by analyzing a metabolic network that models the central carbon metabolism of Saccharomyces cerevisiae. Contact: ajrantan@cs.helsinki.fi
Published: 2006

13. Optimization of cDNA-AFLP experiments using genomic sequence data

Author: Teemu Kivioja, Esko Ukkonen, Merja Penttilä, Markku Saloheimo, and Mikko Arvas
Subjects: 0106 biological sciences, Statistics and Probability, AFLP, Sequence analysis, In silico, Genomics, Biology, 01 natural sciences, Biochemistry, Polymerase Chain Reaction, amplified fragment length polymorphism, 03 medical and health sciences, Complementary DNA, genome-wide expression analysis, Molecular Biology, 030304 developmental biology, Oligonucleotide Array Sequence Analysis, Genetics, 0303 health sciences, Chromosome Mapping, food and beverages, Sequence Analysis, DNA, DNA Fingerprinting, Computer Science Applications, Random Amplified Polymorphic DNA Technique, Gene expression profiling, Computational Mathematics, Restriction enzyme, Computational Theory and Mathematics, Genetic marker, gene expression, Amplified fragment length polymorphism, DNA Probes, Sequence Alignment, Algorithms, Polymorphism, Restriction Fragment Length, cDNA microarrays, 010606 plant biology & botany
Abstract: Motivation: cDNA amplified fragment length polymorphism (cDNA-AFLP) is one of the few genome-wide level expression profiling methods capable of finding genes that have not yet been cloned or even predicted from sequence but have interesting expression patterns under the studied conditions. In cDNA-AFLP, a complex cDNA mixture is divided into small subsets using restriction enzymes and selective PCR. A large cDNA-AFLP experiment can require a substantial amount of resources, such as hundreds of PCR amplifications and gel electrophoresis runs, followed by manual cutting of a large number of bands from the gels. Our aim was to test whether this workload can be reduced by rational design of the experiment. Results: We used the available genomic sequence information to optimize cDNA-AFLP experiments beforehand so that as many transcripts as possible could be profiled with a given amount of resources. Optimization of the selection of both restriction enzymes and selective primers for cDNA-AFLP experiments has not been performed previously. The in silico tests performed suggest that substantial amounts of resources can be saved by the optimization of cDNA-AFLP experiments. Availability: A Perl implementation of the optimization method is available upon request from the authors. Contact: Teemu.Kivioja@vtt.fi
Published: 2005

14. Combinatorial Pattern Matching : 20th Annual Symposium, CPM 2009 Lille, France, June 22-24, 2009 Proceedings

Author: Gregory Kucherov and Esko Ukkonen
Subjects: Computer algorithms--Congresses, Combinatorial analysis--Congresses, Mustervergleich--Lille <2009>--Kongress
Published: 2009

15. Predicting Gene Regulatory Elements in Silico on a Genomic Scale

Author: Alvis Brāzma, Jaak Vilo, Inge Jonassen, and Esko Ukkonen
Subjects: Genetics, Letter, In silico, Genes, Fungal, Gene Expression, Computational biology, Saccharomyces cerevisiae, Biology, Regulatory Sequences, Nucleic Acid, Genome, Sequence pattern, Substring, Transcription (biology), Gene expression, Genome, Fungal, Gene, Genetics (clinical), Yeast genome, Algorithms
Abstract: We performed a systematic analysis of gene upstream regions in the yeast genome for occurrences of regular expression-type patterns with the goal of identifying potential regulatory elements. To achieve this goal, we have developed a new sequence pattern discovery algorithm that searches exhaustively for a priori unknown regular expression-type patterns that are over-represented in a given set of sequences. We applied the algorithm in two cases, (1) discovery of patterns in the complete set of >6000 sequences taken upstream of the putative yeast genes and (2) discovery of patterns in the regions upstream of the genes with similar expression profiles. In the first case, we looked for patterns that occur more frequently in the gene upstream regions than in the genome overall. In the second case, first we clustered the upstream regions of all the genes by similarity of their expression profiles on the basis of publicly available gene expression data and then looked for sequence patterns that are over-represented in each cluster. In both cases we considered each pattern that occurred at least in some minimum number of sequences, and rated them on the basis of their over-representation. Among the highest rating patterns, most have matches to substrings in known yeast transcription factor-binding sites. Moreover, several of them are known to be relevant to the expression of the genes from the respective clusters. Experiments on simulated data show that the majority of the discovered patterns are not expected to occur by chance.
Published: 1998

16. Planning optimal measurements of isotopomer distributions for estimation of metabolic fluxes (a preliminary version of this article appeared in the proceedings of the German Conference on Bioinformatics 2005, Lecture Notes in Informatics Vol. P-71 (2005), pp. 177–191)

Author: Ari Rantanen, Taneli Mielikäinen, Juho Rousu, Hannu Maaheimo, and Esko Ukkonen
Published: 2006

17. Optimization of cDNA-AFLP experiments using genomic sequence data.

Author: Teemu Kivioja, Mikko Arvas, Markku Saloheimo, Merja Penttilä, and Esko Ukkonen
Published: 2005

18. Minimum Description Length Block Finder, a Method to Identify Haplotype Blocks and to Compare the Strength of Block Boundaries

Author: William Hennah, Jesper Ekelund, Teppo Varilo, Markus Perola, Margus Lukk, Mikko Koivisto, Heikki Mannila, Leena Peltonen, and Esko Ukkonen
Subjects: Genetics, 0303 health sciences, education.field_of_study, Models, Genetic, Computer science, 030305 genetics & heredity, Population, Boundary (topology), Articles, Measure (mathematics), Retraction, Dynamic programming, 03 medical and health sciences, Probabilistic method, Haplotypes, Block (programming), Humans, Segmentation, Genetics(clinical), Minimum description length, education, Algorithm, Genetics (clinical), 030304 developmental biology
Abstract: We describe a new probabilistic method for finding haplotype blocks that is based on the use of the minimum description length (MDL) principle. We give a rigorous definition of the quality of a segmentation of a genomic region into blocks and describe a dynamic programming algorithm for finding the optimal segmentation with respect to this measure. We also describe a method for finding the probability of a block boundary for each pair of adjacent markers: this gives a tool for evaluating the significance of each block boundary. We have applied the method to the published data of Daly and colleagues. The results expose some problems that exist in the current methods for the evaluation of the significance of predicted block boundaries. Our method, MDL block finder, can be used to compare block borders in different sample sets, and we demonstrate this by applying the MDL-based method to define the block structure in chromosomes from population isolates.
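The abstract above describes a dynamic programming algorithm that finds the optimal segmentation of a region into blocks with respect to a quality measure. The generic recurrence can be sketched as follows. The block cost used in the example (a fixed penalty plus squared deviation from the block mean) is a hypothetical stand-in chosen only to make the sketch runnable; it is not the MDL code length defined in the paper, and the function names are likewise assumptions.

```python
def optimal_segmentation(n, cost):
    """Split positions 0..n-1 into contiguous blocks of minimum total cost.

    `cost(i, j)` gives the cost of one block covering positions i..j-1.
    best[j] is the cheapest segmentation of the first j positions:
        best[j] = min over i < j of best[i] + cost(i, j)
    """
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            candidate = best[i] + cost(i, j)
            if candidate < best[j]:
                best[j], back[j] = candidate, i
    # Follow the back-pointers to recover the block boundaries.
    blocks = []
    j = n
    while j > 0:
        blocks.append((back[j], j))
        j = back[j]
    return best[n], blocks[::-1]


def toy_cost(values, penalty=1.0):
    """Hypothetical block cost: fixed penalty plus squared deviation."""
    def cost(i, j):
        segment = values[i:j]
        mean = sum(segment) / len(segment)
        return penalty + sum((v - mean) ** 2 for v in segment)
    return cost


total, blocks = optimal_segmentation(6, toy_cost([1, 1, 1, 5, 5, 5]))
```

With the toy cost, the sequence above is split into the two homogeneous halves. The recurrence makes a quadratic number of cost evaluations, so an efficient block cost matters for long marker maps.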

19. The shortest common supersequence problem over binary alphabet is NP-complete

Author: Kari-Jouko Räihä and Esko Ukkonen
Subjects: Discrete mathematics, General Computer Science, String (computer science), 0102 computer and information sciences, 02 engineering and technology, 01 natural sciences, Binary alphabet, Shortest common supersequence, Zero (linguistics), Theoretical Computer Science, Computer Science::Other, Combinatorics, High Energy Physics::Theory, 010201 computation theory & mathematics, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Alphabet, NP-complete, Mathematics, Computer Science(all)
Abstract: We consider the complexity of the Shortest Common Supersequence (SCS) problem, i.e. the problem of finding for finite strings $S_1, S_2, \ldots, S_u$ a shortest string $S$ such that every $S_i$ can be obtained by deleting zero or more elements from $S$. The SCS problem is shown to be NP-complete for strings over an alphabet of size $\ge 2$.
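To make the definition above concrete, the sketch below checks whether a string is a common supersequence of a set of strings and computes a shortest common supersequence for the special case of two strings, which is solvable by standard dynamic programming. The hardness result in the abstract concerns an arbitrary number of strings. The function names are assumptions made for this illustration.

```python
def is_common_supersequence(s, strings):
    """True if every string in `strings` is a subsequence of `s`."""
    def is_subsequence(x):
        it = iter(s)
        return all(ch in it for ch in x)   # `in` consumes the iterator
    return all(is_subsequence(x) for x in strings)


def scs_two(x, y):
    """Shortest common supersequence of two strings via an O(|x||y|) table."""
    m, n = len(x), len(y)
    # dp[i][j] = SCS length of the prefixes x[:i] and y[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = min(dp[i - 1][j], dp[i][j - 1]) + 1
    # Trace back from the corner, collecting symbols in reverse order.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1]); i -= 1; j -= 1
        elif dp[i - 1][j] <= dp[i][j - 1]:
            out.append(x[i - 1]); i -= 1
        else:
            out.append(y[j - 1]); j -= 1
    out.extend(reversed(x[:i]))
    out.extend(reversed(y[:j]))
    return "".join(reversed(out))
```

For instance, `scs_two("0110", "1001")` returns a binary string of length 6 that contains both inputs as subsequences.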

20. Algorithms for approximate string matching

Author: Esko Ukkonen
Subjects: Sequence, String-to-string correction problem, Bitap algorithm, String (computer science), General Engineering, 0102 computer and information sciences, 02 engineering and technology, Approximate string matching, Wagner–Fischer algorithm, 01 natural sciences, 010201 computation theory & mathematics, Damerau–Levenshtein distance, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Edit distance, Algorithm, Engineering(all), Mathematics
Abstract: The edit distance between strings $a_1 \ldots a_m$ and $b_1 \ldots b_n$ is the minimum cost $s$ of a sequence of editing steps (insertions, deletions, changes) that convert one string into the other. A well-known tabulating method computes $s$ as well as the corresponding editing sequence in time and in space $O(mn)$ (in space $O(\min(m, n))$ if the editing sequence is not required). Starting from this method, we develop an improved algorithm that works in time and in space $O(s \cdot \min(m, n))$. Another improvement with time $O(s \cdot \min(m, n))$ and space $O(s \cdot \min(s, m, n))$ is given for the special case where all editing steps have the same cost independently of the characters involved. If the editing sequence that gives cost $s$ is not required, our algorithms can be implemented in space $O(\min(s, m, n))$. Since $s = O(\max(m, n))$, the new methods are always asymptotically as good as the original tabulating method. As a by-product, algorithms are obtained that, given a threshold value $t$, test in time $O(t \cdot \min(m, n))$ and in space $O(\min(t, m, n))$ whether $s \le t$. Finally, different generalized edit distances are analyzed and conditions are given under which our algorithms can be used in conjunction with extended edit operation sets, including, for example, transposition of adjacent characters.
Full Text: View/download PDF
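
The threshold test mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes unit costs for insertions, deletions and changes, and it uses the standard diagonal-band cutoff idea (only cells with |i - j| <= t can hold a value of at most t). It is not the paper's algorithm: the function name `within_edit_distance` is hypothetical, and because each row is allocated in full for readability, the sketch does not reach the time and space bounds stated in the abstract.

```python
def within_edit_distance(a: str, b: str, t: int) -> bool:
    """Return True if the unit-cost edit distance between a and b is at most t.

    Illustrative sketch only: evaluates at most 2t+1 cells per row of the
    usual dynamic-programming table, but allocates whole rows for clarity.
    """
    if t < 0:
        return False
    if len(a) < len(b):
        a, b = b, a  # make b the shorter string
    m, n = len(a), len(b)
    if m - n > t:
        return False  # at least m - n insertions or deletions are needed
    inf = t + 1  # any value above t is treated as "too far"
    # prev[j] = distance between a[:i-1] and b[:j], capped at inf
    prev = [j if j <= t else inf for j in range(n + 1)]
    for i in range(1, m + 1):
        cur = [inf] * (n + 1)
        if i <= t:
            cur[0] = i  # deleting the first i characters of a
        # only columns inside the band |i - j| <= t are evaluated
        for j in range(max(1, i - t), min(n, i + t) + 1):
            change = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j - 1] + change,  # change or match
                         prev[j] + 1,           # deletion
                         cur[j - 1] + 1,        # insertion
                         inf)
        prev = cur
    return prev[n] <= t
```

For example, `within_edit_distance("kitten", "sitting", 3)` holds while the same call with `t = 2` does not, since three unit-cost steps are needed.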

21. Reasoning about Strings in Databases

Author: Matti Nykänen, Gösta Grahne, and Esko Ukkonen
Subjects: Polynomial hierarchy, Finite-state machine, Database, Computer Networks and Communications, Applied Mathematics, String (computer science), 0102 computer and information sciences, 02 engineering and technology, String searching algorithm, Relational algebra, computer.software_genre, 01 natural sciences, Decidability, Undecidable problem, Theoretical Computer Science, Relational calculus, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, Computational Theory and Mathematics, 010201 computation theory & mathematics, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Regular expression, computer, Mathematics
Abstract: In order to enable the database programmer to reason about relations over strings of arbitrary length we introduce alignment logic, a modal extension of relational calculus. In addition to relations, a state in the model consists of a two-dimensional array where the strings are aligned on top of each other. The basic modality in the language (a transpose, or “slide”) allows for a rearrangement of the alignment, and more complex formulas can be formed using a syntax reminiscent of regular expressions, in addition to the usual connectives and quantifiers. It turns out that the computational counterpart of the string-based portion of the logic is the class of multitape two-way finite state automata, which are devices particularly well suited for the implementation of string matching. A computational counterpart of the full logic is obtained from relational algebra by extending the selection operator into filters based on these multitape machines. Safety of formulas in alignment logic implies that new strings generated from old ones have to be of bounded length. While an undecidable property in general, this boundedness is decidable for an important subclass of formulas. As far as expressive power is concerned, alignment logic includes previous proposals for querying string databases, and gives full Turing computability. The language can be restricted to define exactly regular sets and sets in the polynomial hierarchy.
Full Text: View/download PDF

22. Sequential and indexed two-dimensional combinatorial template matching allowing rotations

Author: Gonzalo Navarro, Kimmo Fredriksson, and Esko Ukkonen
Subjects: Matching (statistics), General Computer Science, Template matching, Sublinear time, 0102 computer and information sciences, 02 engineering and technology, String searching algorithm, 01 natural sciences, Theoretical Computer Science, Image (mathematics), Index (publishing), Image processing, 010201 computation theory & mathematics, 0202 electrical engineering, electronic engineering, information engineering, Combinatorial algorithms, 020201 artificial intelligence & image processing, String matching, Focus (optics), Rotation (mathematics), Algorithm, Computer Science(all), Mathematics
Abstract: We present new and faster algorithms to search for a two-dimensional pattern in a two-dimensional text allowing any rotation of the pattern. This has applications such as image databases and computational biology. We consider the cases of exact and approximate matching under several matching models, using a combinatorial approach that generalizes string matching techniques. We focus on sequential algorithms, where only the pattern can be preprocessed, as well as on indexed algorithms, where the text is preprocessed and an index built on it. On sequential searching we derive average-case lower bounds and then obtain optimal average-case algorithms for all the matching models. At the same time, these algorithms are worst-case optimal. On indexed searching we obtain search time polylogarithmic on the text size, as well as sublinear time in general for approximate searching.
Full Text: View/download PDF

23. The complexity of maximum matroid–greedoid intersection and weighted greedoid maximization

Author: Taneli Mielikäinen and Esko Ukkonen
Subjects: Discrete mathematics, Fixed-parameter intractability, 0209 industrial biotechnology, Combinatorial optimization, Applied Mathematics, Parameterized complexity, 0102 computer and information sciences, 02 engineering and technology, Maximization, 01 natural sciences, Matroid, Satisfiability, Combinatorics, 020901 industrial engineering & automation, Intersection, 010201 computation theory & mathematics, Computer Science::Discrete Mathematics, NP-hardness, Discrete Mathematics and Combinatorics, Time complexity, Inapproximability, Mathematics, Greedoid
Abstract: The maximum intersection problem for a matroid and a greedoid, given by polynomial-time oracles, is shown NP-hard by expressing the satisfiability of boolean formulas in 3-conjunctive normal form as such an intersection. The corresponding approximation problems are shown NP-hard for certain approximation performance bounds. Moreover, some natural parameterized variants of the problem are shown W[P]-hard. The results are in contrast with the maximum matroid–matroid intersection which is solvable in polynomial time by an old result of Edmonds. We also prove that it is NP-hard to approximate the weighted greedoid maximization within $2^{n^{O(1)}}$ where $n$ is the size of the domain of the greedoid.
Full Text: View/download PDF

24. A greedy approximation algorithm for constructing shortest common superstrings

Author: Jorma Tarhio and Esko Ukkonen
Subjects: Discrete mathematics, General Computer Science, Superstring theory, Approximation algorithm, Theoretical Computer Science, Combinatorics, symbols.namesake, Shortest Path Faster Algorithm, Greedy approximation, Shortest common superstring, symbols, Heuristics, Hamiltonian (quantum mechanics), Algorithm, Mathematics, Computer Science(all)
Abstract: An approximation algorithm for the shortest common superstring problem is developed, based on the Knuth-Morris-Pratt string-matching procedure and on the greedy heuristics for finding longest Hamiltonian paths in weighted graphs. Given a set $R$ of strings, the algorithm constructs a common superstring for $R$ in $O(mn)$ steps where $m$ is the number of strings in $R$ and $n$ is the total length of these strings. The performance of the algorithm is analysed in terms of the compression in the common superstrings constructed, that is, in terms of $n - k$ where $k$ is the length of the obtained superstring. We show that $(n - k) \geq \frac{1}{2}(n - k_{\min})$ where $k_{\min}$ is the length of a shortest common superstring. Hence the compression achieved by the algorithm is at least half of the maximum compression. It also seems that the lengths always satisfy $k \leq 2 \cdot k_{\min}$ but proving this remains open.
Full Text: View/download PDF
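
The greedy idea described in the abstract above can be sketched in a few lines of Python: repeatedly merge the two strings with the largest suffix-prefix overlap. This is a naive illustration, not the paper's procedure. It computes overlaps by brute force rather than with Knuth-Morris-Pratt, so it does not run in the $O(mn)$ steps reported there, and the names `overlap` and `greedy_superstring` are hypothetical.

```python
def overlap(u: str, v: str) -> int:
    """Length of the longest suffix of u that is also a prefix of v."""
    for k in range(min(len(u), len(v)), 0, -1):
        if u.endswith(v[:k]):
            return k
    return 0


def greedy_superstring(strings: list[str]) -> str:
    """Build a common superstring by always merging the best-overlapping pair."""
    uniq = list(dict.fromkeys(strings))  # drop exact duplicates, keep order
    # strings contained in another string need no separate treatment
    items = [s for s in uniq if not any(s != t and s in t for t in uniq)]
    while len(items) > 1:
        best = (-1, 0, 1)  # (overlap length, index of left, index of right)
        for i, u in enumerate(items):
            for j, v in enumerate(items):
                if i != j:
                    k = overlap(u, v)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        merged = items[i] + items[j][k:]  # glue the pair along its overlap
        items = [s for idx, s in enumerate(items) if idx not in (i, j)]
        items.append(merged)
    return items[0] if items else ""
```

With the input `["abc", "bcd", "cde"]` the total length is $n = 9$ and the sketch returns a superstring of length $k = 5$, so the compression $n - k$ is 4.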

25. Efficient construction of maximal and minimal representations of motifs of a string

Author: François Nicolas, Veli Mäkinen, and Esko Ukkonen
Subjects: General Computer Science, 0206 medical engineering, Fast Fourier transform, Fast Fourier Transform, 0102 computer and information sciences, 02 engineering and technology, Longest palindromic substring, 16. Peace & justice, 01 natural sciences, Text string, Longest repeated substring problem, Substring, Longest common substring problem, Theoretical Computer Science, Combinatorics, Quadratic equation, Suffix tree, 010201 computation theory & mathematics, String processing, Time complexity, 020602 bioinformatics, Mathematics, Computer Science(all), Motif discovery
Abstract: Two substrings of a given text string are called synchronous (occurrence-equivalent) if their sets of occurrence locations are translates of each other. Linear time algorithms are given for the problems of finding a shortest and a longest substring that is synchronous with a given substring. We also introduce approximate variants of the motif discovery problem and give polynomial time algorithms for finding longest and shortest substrings whose suitably translated occurrence location set contains or, respectively, is contained in a given set of locations. The FFT technique used here also leads to an $O(n \log n)$ algorithm for finding the maximum-content gapped motif that is synchronous with a given set of locations; the previously known algorithm for this problem is only quadratic.
Full Text: View/download PDF

26. SEQAID: a DNA sequence assembling program based on a mathematical model

Author: Hannu Peltola, Esko Ukkonen, and Hans Söderlund
Subjects: Genetics, Sequence, Theoretical computer science, biology, Base Sequence, business.industry, Sequence analysis, Computers, Process (computing), DNA sequencing theory, DNA, DNA Restriction Enzymes, Models, Theoretical, DNA sequencing, Restriction fragment, Software, Genes, biology.protein, Benchmark (computing), business
Abstract: A program package, called SEQAID, to support DNA sequencing is presented. The program automatically assembles long DNA sequences from short fragments with minimal user interaction. Various tools for controlling the assembling process are also available. The main novel features of the system are that SEQAID implements several new well-behaved algorithms based on a mathematical model of the problem. It also utilizes available information on restriction fragments to detect illegitimate overlaps and to find relationships between separately assembled sequence blocks. Experiences with the system are reported, including an extremely pathological real sequence which offers an interesting benchmark for this kind of program.
Published: 1984

27. An analytic and systematic framework for estimating metabolic flux ratios from 13C tracer experiments

Author: Ari Rantanen, Paula Jouhten, Juho Rousu, Nicola Zamboni, Esko Ukkonen, and Hannu Maaheimo
Subjects: 0106 biological sciences, Magnetic Resonance Spectroscopy, Databases, Factual, Computation, Citric Acid Cycle, Statistics as Topic, Metabolic network, Saccharomyces cerevisiae, Biology, lcsh:Computer applications to medicine. Medical informatics, 01 natural sciences, Biochemistry, Mass Spectrometry, Fungal Proteins, Pentose Phosphate Pathway, 03 medical and health sciences, Bacterial Proteins, Isomerism, Structural Biology, Artificial Intelligence, 010608 biotechnology, Metabolic flux analysis, Computer Simulation, lcsh:QH301-705.5, Molecular Biology, Independence (probability theory), 030304 developmental biology, 0303 health sciences, Fungal protein, Carbon Isotopes, Applied Mathematics, Methodology Article, Flux balance analysis, Computer Science Applications, Nonlinear system, Glucose, Flow (mathematics), lcsh:Biology (General), Research Design, Isotope Labeling, lcsh:R858-859.7, Neural Networks, Computer, Biological system, Glycolysis, Bacillus subtilis
Abstract: Background: Metabolic fluxes provide invaluable insight on the integrated response of a cell to environmental stimuli or genetic modifications. Current computational methods for estimating the metabolic fluxes from 13C isotopomer measurement data rely either on manual derivation of analytic equations constraining the fluxes or on the numerical solution of a highly nonlinear system of isotopomer balance equations. In the first approach, analytic equations have to be tediously derived for each organism, substrate or labelling pattern, while in the second approach, the global nature of an optimum solution is difficult to prove and comprehensive measurements of external fluxes to augment the 13C isotopomer data are typically needed. Results: We present a novel analytic framework for estimating metabolic flux ratios in the cell from 13C isotopomer measurement data. In the presented framework, equation systems constraining the fluxes are derived automatically from the model of the metabolism of an organism. The framework is designed to be applicable with all metabolic network topologies, 13C isotopomer measurement techniques, substrates and substrate labelling patterns. By analyzing nuclear magnetic resonance (NMR) and mass spectrometry (MS) measurement data obtained from the experiments on glucose with the model micro-organisms Bacillus subtilis and Saccharomyces cerevisiae we show that our framework is able to automatically produce the flux ratios discovered so far by the domain experts with tedious manual analysis. Furthermore, we show by in silico calculability analysis that our framework can rapidly produce flux ratio equations – as well as predict when the flux ratios are unobtainable by linear means – also for substrates not related to glucose.
Conclusion: The core of the 13C metabolic flux analysis framework introduced in this article consists of flow and independence analysis of metabolic fragments and techniques for manipulating isotopomer measurements with vector space techniques. These methods facilitate efficient, analytic computation of the ratios between the fluxes of pathways that converge to a common junction metabolite. The framework can be seen as a generalization and formalization of the existing tradition for computing metabolic flux ratios where equations constraining flux ratios are manually derived, usually without explicitly showing the formal proofs of the validity of the equations.
Full Text: View/download PDF

28. Seed-driven Learning of Position Probability Matrices from Large Sequence Sets

Author: Toivonen, Jarkko, Taipale, Jussi, Ukkonen, Esko, Schwartz, Russell, Reinert, Knut, Department of Computer Science, Combinatorial Pattern Matching research group / Esko Ukkonen, Helsinki Institute for Information Technology, Finnish Centre of Excellence in Algorithmic Data Analysis Research (Algodan), and Bioinformatics
Subjects: computational biology, 000 Computer science, knowledge, general works, education, Computer Science, DNA motifs, 113 Computer and information sciences
Abstract: We formulate and analyze a novel seed-driven algorithm SeedHam for PPM learning. To learn a PPM of length l, the algorithm uses the most frequent l-mer of the training data as a seed, and then restricts the learning into a small Hamming neighbourhood of the seed. The SeedHam method is intended for PPM learning from large sequence sets (up to hundreds of Mbases) containing enriched motif instances. A robust variant of the method is introduced that decreases contamination from artefact instances of the motif and thereby allows using larger Hamming neighbourhoods. To solve the motif orientation problem in two-stranded DNA we introduce a novel seed finding rule, based on analysis of the palindromic structure of sequences. Test experiments are reported that illustrate the relative strengths of different variants of our methods, and show that our algorithms are fast and give stable and accurate results. Availability and implementation: A C++ implementation of the method is available from https://github.com/jttoivon/seedham/. Contact: jarkko.toivonen@cs.helsinki.fi
Published: 2017
Full Text: View/download PDF
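
The seed-and-neighbourhood idea in the abstract above can be illustrated with a simplified Python sketch: take the most frequent l-mer as the seed, then estimate column-wise frequencies from all l-mers within a given Hamming distance of it. This is an assumption-laden toy version, not the SeedHam implementation: it ignores reverse complements, the robust variant and the palindrome-based seed rule, it assumes that sequences contain only characters of the given alphabet and that at least one l-mer exists, and the names `seed_ppm` and `hamming` are hypothetical.

```python
from collections import Counter


def hamming(u: str, v: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(u, v))


def seed_ppm(seqs: list[str], l: int, r: int, alphabet: str = "ACGT"):
    """Return (seed, ppm), where ppm[p][c] is the frequency of c at position p.

    Only l-mers within Hamming distance r of the seed contribute.
    Assumes every sequence uses only characters from `alphabet`.
    """
    # count every l-mer occurrence in the training data
    counts = Counter(s[i:i + l] for s in seqs for i in range(len(s) - l + 1))
    seed, _ = counts.most_common(1)[0]  # most frequent l-mer is the seed
    columns = [{c: 0 for c in alphabet} for _ in range(l)]
    total = 0
    for kmer, cnt in counts.items():
        if hamming(seed, kmer) <= r:  # stay inside the neighbourhood
            total += cnt
            for p, ch in enumerate(kmer):
                columns[p][ch] += cnt
    ppm = [{c: col[c] / total for c in alphabet} for col in columns]
    return seed, ppm
```

As a usage example, calling `seed_ppm(["ACGTACGA", "TTACGTTT", "ACGT"], l=4, r=1)` picks `ACGT` as the seed (three occurrences); the only other l-mer within distance 1 is `ACGA`, so the last column assigns 0.75 to `T` and 0.25 to `A`.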

29. LZ-End Parsing in Linear Time

Author: Kempa, Dominik, Kosolobov, Dmitry, Pruhs, Kirk, Sohler, Christian, Department of Computer Science, Combinatorial Pattern Matching research group / Esko Ukkonen, Helsinki Institute for Information Technology, Practical Algorithms and Data Structures on Strings research group / Juha Kärkkäinen, Finnish Centre of Excellence in Algorithmic Data Analysis Research (Algodan), and Genome-scale Algorithmics research group / Veli Mäkinen
Subjects: TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, 000 Computer science, knowledge, general works, LZ77, linear time, LZ-End, Computer Science, education, construction algorithm, 16. Peace & justice, 113 Computer and information sciences
Abstract: We present a deterministic algorithm that constructs in linear time and space the LZ-End parsing (a variation of LZ77) of a given string over an integer polynomially bounded alphabet.
Published: 2017

30. Lipschitz Bandits without the Lipschitz Constant

Author: Gilles Stoltz, Sébastien Bubeck, Jia Yuan Yu, Centre de Recerca Matemàtica (CRM), Universitat Autònoma de Barcelona (UAB), Département de Mathématiques et Applications - ENS Paris (DMA), École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS), Groupement de Recherche et d'Etudes en Gestion à HEC (GREGH), Ecole des Hautes Etudes Commerciales (HEC Paris)-Centre National de la Recherche Scientifique (CNRS), Computational Learning, Aggregation, Supervised Statistical, Inference, and Classification (CLASSIC), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-Inria Paris-Rocquencourt, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Jyrki Kivinen, Csaba Szepesvári, Esko Ukkonen, Thomas Zeugmann, European Project: 216886,EC:FP7:ICT,FP7-ICT-2007-1,PASCAL2(2008), Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Inria Paris-Rocquencourt, and École normale supérieure - Paris (ENS Paris)
Subjects: Discrete mathematics, 0209 industrial biotechnology, Mathematical optimization, Continuum (topology), Regret, Mathematics - Statistics Theory, 0102 computer and information sciences, 02 engineering and technology, Function (mathematics), Statistics Theory (math.ST), [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH], Lipschitz continuity, Minimax, 01 natural sciences, Orders of magnitude (bit rate), 020901 industrial engineering & automation, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], 010201 computation theory & mathematics, [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST], FOS: Mathematics, Special case, Constant (mathematics), Mathematics
Abstract: We consider the setting of stochastic bandit problems with a continuum of arms indexed by $[0, 1]^d$. We first point out that the strategies considered so far in the literature only provided theoretical guarantees of the form: given some tuning parameters, the regret is small with respect to a class of environments that depends on these parameters. This is however not the right perspective, as it is the strategy that should adapt to the specific bandit environment at hand, and not the other way round. Put differently, an adaptation issue is raised. We solve it for the special case of environments whose mean-payoff functions are globally Lipschitz. More precisely, we show that the minimax optimal orders of magnitude $L^{d/(d+2)} \, T^{(d+1)/(d+2)}$ of the regret bound over $T$ time instances against an environment whose mean-payoff function $f$ is Lipschitz with constant $L$ can be achieved without knowing $L$ or $T$ in advance. This is in contrast to all previously known strategies, which require to some extent the knowledge of $L$ to achieve this performance guarantee.
Published: 2011

31. How to Draw a Series-Parallel Digraph

Author: Giuseppe Di Battista, Roberto Tamassia, Ioannis G. Tollis, Paola Bertolazzi, Robert F. Cohen, Otto Nurmi, and Esko Ukkonen
Subjects: Discrete mathematics, Applied Mathematics, Structure (category theory), Parallel algorithm, Digraph, Series and parallel circuits, Theoretical Computer Science, Exponential function, Combinatorics, Computational Mathematics, Computational Theory and Mathematics, Simple (abstract algebra), Graph drawing, Geometry and Topology, Dominance drawing, Computer Science::Data Structures and Algorithms, MathematicsofComputing_DISCRETEMATHEMATICS, Mathematics
Abstract: Upward and dominance drawings of acyclic digraphs find important applications in the display of hierarchical structures such as PERT diagrams, subroutine-call charts, and is-a relationships. The combinatorial model underlying such hierarchical structures is often a series-parallel digraph. In this paper the problem of constructing upward and dominance drawings of series-parallel digraphs is investigated. We show that the area requirement of upward and dominance drawings of series-parallel digraphs crucially depends on the choice of planar embedding. Also, we present efficient sequential and parallel algorithms for drawing series-parallel digraphs. Our results show that while series-parallel digraphs have a rather simple and well understood combinatorial structure, naive drawing strategies lead to drawings with exponential area, and clever algorithms are needed to achieve optimal area.
Published: 1992
