73 results on '"Ari Löytynoja"'
Search Results
2. Effects of marker type and filtering criteria on QST-FST comparisons
- Author
-
Zitong Li, Ari Löytynoja, Antoine Fraimout, and Juha Merilä
- Subjects
qst-fst; natural selection; microsatellite; quantitative genetics; pungitius pungitius; single-locus polymorphisms; Science
- Abstract
Comparative studies of quantitative and neutral genetic differentiation (QST-FST tests) provide means to detect adaptive population differentiation. However, QST-FST tests can be overly liberal if the markers used deflate FST below its expectation, or overly conservative if methodological biases lead to inflated FST estimates. We investigated how marker type and filtering criteria for marker selection influence QST-FST comparisons through their effects on FST using simulations and empirical data on over 18 000 in silico genotyped microsatellites and 3.8 million single-locus polymorphism (SNP) loci from four populations of nine-spined sticklebacks (Pungitius pungitius). Empirical and simulated data revealed that FST decreased with increasing marker variability, and was generally higher with SNPs than with microsatellites. The estimated baseline FST levels were also sensitive to filtering criteria for SNPs: both minor alleles and linkage disequilibrium (LD) pruning influenced FST estimation, as did marker ascertainment. However, in the case of stickleback data used here where QST is high, the choice of marker type, their genomic location, ascertainment and filtering made little difference to outcomes of QST-FST tests. Nevertheless, we recommend that QST-FST tests using microsatellites should discard the most variable loci, and those using SNPs should pay attention to marker ascertainment and properly account for LD before filtering SNPs. This may be especially important when level of quantitative trait differentiation is low and levels of neutral differentiation high.
- Published
- 2019
3. Bracketing phenogenotypic limits of mammalian hybridization
- Author
-
Yoland Savriama, Mia Valtonen, Juhana I. Kammonen, Pasi Rastas, Olli-Pekka Smolander, Annina Lyyski, Teemu J. Häkkinen, Ian J. Corfe, Sylvain Gerber, Isaac Salazar-Ciudad, Lars Paulin, Liisa Holm, Ari Löytynoja, Petri Auvinen, and Jukka Jernvall
- Subjects
species hybridization; introgression; developmental conservation; disparity; morphology; dental; Science
- Abstract
An increasing number of mammalian species have been shown to have a history of hybridization and introgression based on genetic analyses. Only relatively few fossils, however, preserve genetic material, and morphology must be used to identify the species and determine whether morphologically intermediate fossils could represent hybrids. Because dental and cranial fossils are typically the key body parts studied in mammalian palaeontology, here we bracket the potential for phenotypically extreme hybridizations by examining uniquely preserved cranio-dental material of a captive hybrid between grey and ringed seals. We analysed how distinct these species are genetically and morphologically, how easy it is to identify the hybrids using morphology and whether comparable hybridizations happen in the wild. We show that the genetic distance between these species is more than twice the modern human–Neanderthal distance, but still within that of morphologically similar species pairs known to hybridize. By contrast, morphological and developmental analyses show grey and ringed seals to be highly disparate, and that the hybrid is a predictable intermediate. Genetic analyses of the parent populations reveal introgression in the wild, suggesting that grey–ringed seal hybridization is not limited to captivity. Taken together, we postulate that there is considerable potential for mammalian hybridization between phenotypically disparate taxa.
- Published
- 2018
4. Pline: automatic generation of modern web interfaces for command-line programs.
- Author
-
Andres Veidenberg and Ari Löytynoja
- Published
- 2020
5. Estimating recent and historical effective population size of marine and freshwater sticklebacks
- Author
-
Xueyun Feng, Ari Löytynoja, and Juha Merilä
- Abstract
Effective population size (Ne) is a quantity of central importance in evolutionary biology and population genetics, but often notoriously challenging to estimate. Analyses of Ne are further complicated by the many interpretations of the concept and the alternative approaches to quantify Ne utilising widely different properties of the data. On the other hand, alternative methods are informative for different time scales such that a set of complementary methods should allow piecing together the entire continuum of Ne from a few generations before the present to the distant past. To test this in practice, we inferred the continuum of Ne for 45 nine-spined stickleback populations (Pungitius pungitius) using whole-genome data. We found that the marine populations had the largest historical and recent Ne, followed by coastal and other freshwater populations. We identified the impact of both recent and historical gene flow on the Ne estimates obtained from different methods and found that simple summary statistics are informative in comprehending the events in the very recent past. Overall, our analyses showed that the coalescence-based trajectories of Ne in the recent past and the LD-based estimates of near-contemporary Ne are incongruent, though in some cases the incongruence might be explained by specific demographic events. Despite still lacking accuracy and resolution for the very recent past, the sequentially Markovian coalescent-based methods seem to provide the most meaningful interpretation of the real-life Ne varying across time.
- Published
- 2023
6. Generation of de novo miRNAs from template switching during DNA replication
- Author
-
Heli A. M. Mönttinen, Mikko J. Frilander, and Ari Löytynoja
- Abstract
Micro-RNA (miRNA) genes represent one of the most constrained examples of genetic information found in metazoan genomes. All primary transcripts generated from these elements fold into a short RNA stem-loop structure that is further processed to a ~22 bp dsRNA duplex. Incorporation of one of the strands into the RNA Induced Silencing complex (RISC) can then facilitate negative gene expression regulation. While a substantial number of miRNA genes are ancient and highly conserved, many are subject to rapid evolutionary change. Entirely novel miRNA genes have been shown to emerge in a lineage-specific manner in primates and plants. However, the specific mechanisms that generate these genetic elements have not been determined. Here, we show that up to 100 lineage-specific miRNA genes have emerged in primates via mutations caused by template switching during DNA replication. Template-switching mutations (TSMs) are complex mutations that can introduce in a single replication cycle inverted DNA repeats capable of forming perfect hairpin structures. Our findings demonstrate that TSMs are actively generating complex biological sequence variants which in suitable circumstances create miRNA genes from previously non-functional genomic sequences. This mechanism is orders of magnitude faster than others proposed for the de novo creation of genes, enabling near-instant rewiring of genetic information and rapid adaptation to changing environments.
- Published
- 2023
7. Determinants of genetic diversity in sticklebacks
- Author
-
Mikko Kivikoski, Xueyun Feng, Ari Löytynoja, Paolo Momigliano, and Juha Merilä
- Abstract
Understanding what determines species and population differences in levels of genetic diversity has important implications for our understanding of evolution, as well as for the conservation and management of wild populations. Previous comparative studies have emphasized the roles of linked selection, life-history trait variation and genomic properties, rather than pure demography, as important determinants of genetic diversity. However, these findings are based on coarse estimates across a range of highly diverged taxa, and it is unclear how well they represent the processes within individual species. We assessed genome-wide genetic diversity (π) in 45 nine-spined stickleback (Pungitius pungitius) populations and found that π varied 15-fold among populations (πmin ≈ 0.00015, πmax ≈ 0.0023) whereas estimates of recent effective population sizes varied 122-fold. Analysis of inbreeding coefficients (FROH) estimated from runs of homozygosity revealed strong negative association between π and FROH. Genetic diversity was also negatively correlated with mean body size and longevity, but these associations were not statistically significant after controlling for demographic effects (FROH). The results give strong support for the view that populations’ demographic features, rather than life history differences, are the chief determinants of genetic diversity in the wild.
- Published
- 2023
8. Predicting recombination frequency from map distance
- Author
-
Juha Merilä, Mikko Kivikoski, Ari Löytynoja, Pasi Rastas, Ecological Genetics Research Unit, Organismal and Evolutionary Biology Research Programme, Institute of Biotechnology, Bioinformatics, and Ari Pekka Löytynoja / Principal Investigator
- Subjects
0106 biological sciences; Systematic error; X-chromosome; 0303 health sciences; Empirical data; Genome; Evolution; Crossovers; 1184 Genetics, developmental biology, physiology; Function (mathematics); Variation (game tree); 010603 evolutionary biology; 01 natural sciences; 03 medical and health sciences; Centimorgan; Autosome; Genetics; Drosophila; Interference; Algorithm; Recombination; Genetics (clinical); 030304 developmental biology; Mathematics; Construction
- Abstract
Map distance is one of the key measures in genetics and indicates the expected number of crossovers between two loci. Map distance is estimated from the observed recombination frequency using mapping functions, the most widely used of those, Haldane and Kosambi, being developed at the time when the number of markers was low and unobserved crossovers had a substantial effect on the recombination fractions. In contemporary high-density marker data, the probability of multiple crossovers between adjacent loci is negligible and different mapping functions yield the same result, that is, the recombination frequency between adjacent loci is equal to the map distance in Morgans. However, high-density linkage maps contain an interpretation problem: the map distance over a long interval is additive and its association with recombination frequency is not defined. Here, we demonstrate with high-density linkage maps from humans and stickleback fishes that the inverse of Haldane or Kosambi mapping functions fail to predict the recombination frequency from map distance, and show that this is because the expected number of crossovers is not sufficient to predict recombination frequency. We formulate a piecewise function to calculate the probability of no crossovers between the markers that yields more accurate predictions of recombination frequency from map distance. Our results demonstrate that the association between map distance and recombination frequency is context-dependent and no universal solution exists. We anticipate that our study will motivate further research on this subject to yield a more accurate mathematical description of map distance in the context of modern data.
- Published
- 2022
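The two classical mapping functions named in the abstract of entry 8 can be sketched in a few lines. This is a minimal illustration of the textbook Haldane and Kosambi formulas only, with map distance d given in Morgans; it is not the piecewise function formulated in the paper, and the function names are chosen here for illustration.

```python
import math


def haldane_r(d: float) -> float:
    """Recombination frequency from map distance d (Morgans), Haldane (no interference)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def kosambi_r(d: float) -> float:
    """Recombination frequency from map distance d (Morgans), Kosambi (with interference)."""
    return 0.5 * math.tanh(2.0 * d)


if __name__ == "__main__":
    # For short intervals both functions are close to r = d;
    # for long intervals both approach the upper bound of 0.5.
    for d in (0.001, 0.01, 0.1, 0.5, 1.0):
        print(f"d={d:<6} Haldane r={haldane_r(d):.5f}  Kosambi r={kosambi_r(d):.5f}")
```

For short intervals both functions return a value close to d itself, which matches the abstract's remark that recombination frequency between adjacent loci in dense maps equals the map distance in Morgans.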
9. Complex population history affects admixture analyses in nine-spined sticklebacks
- Author
-
Ari Löytynoja, Juha Merilä, Xueyun Feng, Institute of Biotechnology, Organismal and Evolutionary Biology Research Programme, University of Helsinki, Ecological Genetics Research Unit, Bioinformatics, and Ari Pekka Löytynoja / Principal Investigator
- Subjects
Gene Flow; PUNGITIUS-PUNGITIUS; NORTH; Genome; introgression; POSTGLACIAL COLONIZATION; Fresh Water; phylogeography; GENETIC-STRUCTURE; sticklebacks; Smegmamorpha; Europe; Genetics, Population; 1181 Ecology, evolutionary biology; Genetics; EVOLUTIONARY; DIVERGENCE; admixture; 1182 Biochemistry, cell and molecular biology; Animals; THREESPINE STICKLEBACK; ADAPTIVE INTROGRESSION; SPECIATION; Ecology, Evolution, Behavior and Systematics; HYBRIDIZATION
- Abstract
Introgressive hybridization is an important process in evolution but challenging to identify, undermining the efforts to understand its role and significance. On the contrary, many analytical methods assume direct descent from a single common ancestor, and admixture among populations can violate their assumptions and lead to seriously biased results. A detailed analysis of 888 whole-genome sequences of nine-spined sticklebacks (Pungitius pungitius) revealed a complex pattern of population ancestry involving multiple waves of gene flow and introgression across northern Europe. The two recognized lineages were found to have drastically different histories, and their secondary contact zone was wider than anticipated, displaying a smooth gradient of foreign ancestry with some curious deviations from the expected pattern. Interestingly, the freshwater isolates provided peeks into the past and helped to understand the intermediate states of evolutionary processes. Our analyses and findings paint a detailed picture of the complex colonization history of northern Europe and provide backdrop against which introgression and its role in evolution can be investigated. However, they also expose the challenges in analyses of admixed populations and demonstrate how hidden admixture and colonization history misleads the estimation of admixture proportions and population split times.
- Published
- 2022
10. Low heritability of crossover rate in wild sticklebacks
- Author
-
Mikko Kivikoski, Antoine Fraimout, Pasi Rastas, Ari Löytynoja, and Juha Merilä
- Abstract
Crossover rate is mostly studied with domesticated or lab-reared populations and little is known about its genetic variation in the wild. We studied the variation and genetic underpinnings of crossover rate in outbred wild nine- (Pungitius pungitius) and three-spined (Gasterosteus aculeatus) sticklebacks. In both species, the crossover rate of females exceeded that of males as did also its repeatability (RFemales = 0.21–0.33, RMales = 0.026–0.11), implying individual differences of crossover rate in females, but no or less so in males. However, in both species and sexes additive genetic variance and heritability of crossover rate were effectively zero. A review of the previously reported repeatability and heritability estimates revealed that the repeatabilities in stickleback females were moderately high, whereas those in males were very low. Genome-wide association analyses recovered a few candidate regions possibly involved with control of crossover rate. The low additive genetic variance of crossover rate in wild sticklebacks suggests limited evolvability of crossover rate.
- Published
- 2022
11. Fragmented habitat compensates for the adverse effects of genetic bottleneck
- Author
-
Ari Löytynoja, Pasi Rastas, Mia Valtonen, Juhana Kammonen, Liisa Holm, Morten Tange Olsen, Lars Paulin, Jukka Jernvall, Petri Auvinen, Institute of Biotechnology, Bioinformatics, Ari Pekka Löytynoja / Principal Investigator, Centre for Information Technology, Organismal and Evolutionary Biology Research Programme, Genetics, Computational genomics, Faculty of Science, Department of Geosciences and Geography, Jukka Jernvall / Principal Investigator, and DNA Sequencing and Genomics
- Subjects
runs of homozygosity; Marine; Evolution; Effective population-size; metapopulation; Phoca-hispida; Salmon salmo-salar; General Biochemistry, Genetics and Molecular Biology; Europe; Postglacial colonization; genetic variation; 1181 Ecology, evolutionary biology; mammals; habitat fragmentation; General Agricultural and Biological Sciences; genetic bottleneck; pinniped
- Abstract
In the face of the human-caused biodiversity crisis, understanding the theoretical basis of conservation efforts of endangered species and populations has become increasingly important. According to population genetics theory, population subdivision helps organisms retain genetic diversity, crucial for adaptation in a changing environment. Habitat topography is thought to be important for generating and maintaining population subdivision, but empirical cases are needed to test this assumption. We studied Saimaa ringed seals, landlocked in a labyrinthine lake and recovering from a drastic bottleneck, with additional samples from three other ringed seal subspecies. Using whole-genome sequences of 145 seals, we analyzed the distribution of variation and genetic relatedness among the individuals in relation to the habitat shape. Despite a severe history of genetic bottlenecks with prevalent homozygosity in Saimaa ringed seals, we found evidence for the population structure mirroring the subregions of the lake. Our genome-wide analyses showed that the subpopulations had retained unique variation and largely complementary patterns of homozygosity, highlighting the significance of habitat connectivity in conservation biology and the power of genomic tools in understanding its impact. The central role of the population substructure in preserving genetic diversity at the metapopulation level was confirmed by simulations. Integration of genetic analyses in conservation decisions gives hope to Saimaa ringed seals and other endangered species in fragmented habitats.
- Published
- 2022
12. Accurate extension of multiple sequence alignments using a phylogeny-aware graph algorithm.
- Author
-
Ari Löytynoja, Albert J. Vilella, and Nick Goldman
- Published
- 2012
13. Template switching in DNA replication can create and maintain RNA hairpins
- Author
-
Ari Löytynoja, Heli Mönttinen, Institute of Biotechnology, Bioinformatics, and Ari Pekka Löytynoja / Principal Investigator
- Subjects
DNA Replication; Evolution; DATABASE; 0206 medical engineering; INVERTED REPEATS; 02 engineering and technology; 03 medical and health sciences; template switch mutation; ancestral sequence reconstruction; Base Pairing; 030304 developmental biology; 0303 health sciences; hairpin loop; Multidisciplinary; MUTAGENESIS; Base Sequence; SECONDARY STRUCTURE; SEQUENCES; MUTATIONS; compensatory mutation; Inverted Repeat Sequences; 1184 Genetics, developmental biology, physiology; DNA; Templates, Genetic; Biological Sciences; RNA secondary structure; CD-HIT; MODEL; Mutation; Nucleic Acid Conformation; RNA; CONCERTED EVOLUTION; RIBOSOMAL-RNA; 020602 bioinformatics
- Abstract
Significance: RNA hairpin structures require perfect pairing between consecutive bases of the opposite sides of the stem. Random mutations are unlikely to create complex structures, so the origin of long stems and the maintenance of their perfect base pairing through compensatory substitutions have puzzled evolutionary biologists. We reconstructed ancestral sequence histories of RNA sequences and found mutation patterns consistent with template switching in DNA replication. We propose the template switch mutation mechanism as the explanation for the evolution of perfect stem structures and show that the mechanism also provides an elegant explanation for multinucleotide jumps in the sequence space and for the observed asymmetry in the stem base pair frequencies.
The evolutionary origin of RNA stem structures and the preservation of their base pairing under a spontaneous and random mutation process have puzzled theoretical evolutionary biologists. DNA replication–related template switching is a mutation mechanism that creates reverse-complement copies of sequence regions within a genome by replicating briefly along either the complementary or nascent DNA strand. Depending on the relative positions and context of the four switch points, this process may produce a reverse-complement repeat capable of forming the stem of a perfect DNA hairpin or fix the base pairing of an existing stem. Template switching is typically thought to trigger large structural changes, and its possible role in the origin and evolution of RNA genes has not been studied. Here, we show that the reconstructed ancestral histories of RNA genes contain mutation patterns consistent with the DNA replication–related template switching. In addition to multibase compensatory mutations, the mechanism can explain complex sequence changes, although mutations breaking the structure rarely get fixed in evolution. Our results suggest a solution for the long-standing dilemma of RNA gene evolution and demonstrate how template switching can both create perfect stems with a single mutation event and help maintain the stem structure over time. Interestingly, template switching also provides an elegant explanation for the asymmetric base pair frequencies within RNA stems.
- Published
- 2022
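The idea in the abstract of entry 13, that a reverse-complement copy of a nearby region yields a perfectly paired stem, can be illustrated with a toy string model. This is only a sketch under simplifying assumptions: it works on plain DNA strings, ignores the four switch points and the replication machinery, and all function names are hypothetical rather than taken from the authors' tools.

```python
# Toy model: a stem followed by a loop and the stem's reverse complement
# forms an inverted repeat whose two arms can pair perfectly.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def make_inverted_repeat(stem: str, loop: str) -> str:
    """Build stem + loop + reverse-complement(stem)."""
    return stem + loop + reverse_complement(stem)


def is_perfect_hairpin(seq: str, stem_len: int) -> bool:
    """Check that the last stem_len bases pair perfectly with the first stem_len bases."""
    if stem_len <= 0 or 2 * stem_len > len(seq):
        return False
    return seq[-stem_len:] == reverse_complement(seq[:stem_len])


if __name__ == "__main__":
    hairpin = make_inverted_repeat("GATTACA", "TTTT")
    print(hairpin, is_perfect_hairpin(hairpin, 7))
```

A single point substitution in one arm makes `is_perfect_hairpin` return False, which is the kind of broken pairing that the abstract says a template switch can fix in one event.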
14. Determination and validation of principal gene products.
- Author
-
Michael L. Tress, Jan-Jaap Wesselink, Adam Frankish, Gonzalo López, Nick Goldman, Ari Löytynoja, Tim Massingham, Fabio Pardi, Simon Whelan, Jennifer L. Harrow, and Alfonso Valencia
- Published
- 2008
15. Thousands of human mutation clusters are explained by short-range template switching
- Author
-
Ari Löytynoja, Institute of Biotechnology, and Bioinformatics
- Subjects
Base pair; 1184 Genetics, developmental biology, physiology; GENOMES; Computational biology; Biology; SEQUENCE; Genome; DNA sequencing; INSIGHTS; Mutation (genetic algorithm); Genotype; Genetics; TOOL; Human genome; Genotyping; Genetics (clinical); Sequence (medicine)
- Abstract
Variation within human genomes is distributed unevenly and variants show spatial clustering. DNA-replication related template switching is a poorly known mutational mechanism capable of causing major chromosomal rearrangements as well as creating short inverted sequence copies that appear as local mutation clusters in sequence comparisons. We reanalyzed haplotype-resolved genome assemblies representing 25 human populations and multinucleotide variants aggregated from 140,000 human sequencing experiments. We found local template switching to explain thousands of complex mutation clusters across the human genome, the loci segregating within and between populations with a small number appearing as de novo mutations. We developed computational tools for genotyping candidate template switch loci using short-read sequencing data and for identification of template switch events using both short-read data and genotype data. These tools will enable building a catalogue of affected loci and studying the cellular mechanisms behind template switching both in healthy organisms and in disease. Strikingly, we noticed that widely-used analysis pipelines for short-read sequencing data - capable of identifying single nucleotide changes - may miss TSM-origin inversions of tens of base pairs, potentially invalidating medical genetic studies searching for causative alleles behind genetic diseases.
Author summary: Mutations are not randomly distributed in genomes and they often appear as clusters of nearby changes. We earlier showed that a poorly known mechanism in DNA replication can create short inverted copies of nearby sequence and that these events then show as mutation clusters in sequence comparison. Using the latest DNA sequencing and variation data we show that the human genome contains thousands of mutation clusters consistent with this mechanism and that novel mutations are created at a significant rate. Strikingly we observed that widely used methods for processing DNA sequencing data may completely miss these mutations. This has significance e.g. in medical genetic studies aiming to identify mutations causing genetic diseases.
- Published
- 2021
16. A hidden Markov model for progressive multiple alignment.
- Author
-
Ari Löytynoja and Michel C. Milinkovitch
- Published
- 2003
17. Complex population history affects admixture analyses in nine-spined sticklebacks
- Author
-
Xueyun Feng, Juha Merilä, and Ari Löytynoja
- Subjects
0106 biological sciences; 0303 health sciences; education.field_of_study; Population; Introgression; 010603 evolutionary biology; 01 natural sciences; Colonisation; 03 medical and health sciences; Phylogeography; Geography; Evolutionary biology; Contact zone; Adaptation; education; 030304 developmental biology; Ancestor; Local adaptation
- Abstract
Introgressive hybridization is an important process in evolution but challenging to identify, undermining the efforts to understand its role and significance. On the other hand, many analytical methods assume direct descent from a single common ancestor, and admixture among populations can violate their assumptions and lead to seriously biased results. A detailed analysis of 888 whole genome sequences of nine-spined sticklebacks (Pungitius pungitius) revealed a complex pattern of population ancestry involving multiple waves of gene flow and introgression across northern Europe. The two recognized lineages were found to have drastically different histories and their secondary contact zone was wider than anticipated, displaying a smooth gradient of foreign ancestry with some curious deviations from the expected pattern. Interestingly, the freshwater isolates provided peeks into the past and helped to understand the intermediate states of evolutionary processes. Our analyses and findings paint a detailed picture of the complex colonization history of northern Europe and provide back-drop against which introgression and its role in evolution can be investigated. However, they also expose the challenges in analyses of admixed populations and demonstrate how hidden admixture and colonization history misleads the estimation of admixture proportions and population split times.
- Published
- 2021
18. Template switching in DNA replication can create and maintain RNA hairpins
- Author
-
Heli A. M. Mönttinen and Ari Löytynoja
- Subjects
0303 health sciences; Base pair; Point mutation; 030302 biochemistry & molecular biology; DNA replication; RNA; Context (language use); Computational biology; Biology; 03 medical and health sciences; chemistry.chemical_compound; chemistry; Mutation (genetic algorithm); Gene; DNA; 030304 developmental biology
- Abstract
The evolutionary origin of ribonucleic acid (RNA) stem structures (1, 2) and the preservation of their base-pairing under a spontaneous and random mutation process have puzzled theoretical evolutionary biologists (3, 4). DNA replication-related template switching (5, 6) is a mutation mechanism that creates reverse-complement copies of sequence regions within a genome by replicating briefly either along the complementary or nascent DNA strand. Depending on the relative positions and context of the four switch points, this process may produce a reverse-complement repeat capable of forming the stem of a perfect DNA hairpin, or fix the base-pairing of an existing stem (7). Template switching is typically thought to trigger large structural changes (8-10) and its possible role in the origin and evolution of RNA genes has not been studied. Here we show that the reconstructed ancestral history of ribosomal RNA sequences contains compensatory base substitutions that are linked with parallel sequence changes consistent with the DNA replication-related template switching. In addition to compensatory mutations, the mechanism can explain complex changes involving non-Watson-Crick pairing and appearances of novel stem structures, though mutations breaking the structure rarely get fixed in evolution. Our results suggest a solution for the longstanding dilemma of RNA gene evolution (1, 3, 4) and demonstrate how template switching can both create perfect stem structures with a single mutation event and maintain their base pairing over time with matching changes. The mechanism can also generate parallel sequence changes, many inexplicable under the point mutation model (11), and provides an explanation for the asymmetric base-pair frequencies in stem structures (12).
- Published
- 2021
19. Automated improvement of stickleback reference genome assemblies with Lep-Anchor software
- Author
-
Ari Löytynoja, Juha Merilä, Pasi Rastas, Mikko Kivikoski, Ecological Genetics Research Unit, Organismal and Evolutionary Biology Research Programme, Institute of Biotechnology, Bioinformatics, and Ari Pekka Löytynoja / Principal Investigator
- Subjects
0106 biological sciences; 0301 basic medicine; SELECTION; haplotype; Computer science; Contiguity; Sequence assembly; Computational biology; 010603 evolutionary biology; 01 natural sciences; SEQUENCE; Genome; 03 medical and health sciences; Pungitius; 0302 clinical medicine; Gasterosteus; Genetic variation; Genetics; DIVERGENCE; QUALITY; Animals; Ecology, Evolution, Behavior and Systematics; X chromosome; 030304 developmental biology; Linkage (software); ARCHITECTURE; 0303 health sciences; CONSTRUCTION; Contig; biology; stickleback; Stickleback; Chromosome Mapping; FRAMEWORK; biology.organism_classification; EVOLUTION; Smegmamorpha; TIME; 030104 developmental biology; mosaicism; genome assembly; MAP; 1182 Biochemistry, cell and molecular biology; 030217 neurology & neurosurgery; Software; Biotechnology; Reference genome
- Abstract
Summary: We describe an integrative approach to improve contiguity and haploidy of a reference genome assembly and demonstrate its impact with practical examples. With two novel features of Lep-Anchor software and a combination of dense linkage maps, overlap detection and bridging long reads we generated an improved assembly of the nine-spined stickleback (Pungitius pungitius) reference genome. We were able to remove a significant number of haplotypic contigs, detect more genetic variation and improve the contiguity of the genome, especially that of the X chromosome. However, improved scaffolding cannot correct for mosaicism of erroneously assembled contigs, demonstrated by a de novo assembly of a 1.7 Mbp inversion. Qualitatively similar gains were obtained with the genome of three-spined stickleback (Gasterosteus aculeatus). Since the utility of genome-wide sequencing data in biological research depends heavily on the quality of the reference genome, the improved and fully automated approach described here should be helpful in refining reference genome assemblies.
- Published
- 2021
20. Evolutionary Sequence Analysis and Visualization with Wasabi
- Author
-
Andres Veidenberg, Ari Löytynoja, Katoh, Kazutaka, Ari Pekka Löytynoja / Principal Investigator, Bioinformatics, and Institute of Biotechnology
- Subjects
SELECTION; Evolutionary sequence analysis; Computer science; Context (language use); computer.software_genre; Reproducible research; 03 medical and health sciences; 0302 clinical medicine; Data visualization; Web application; Plug-in; 030304 developmental biology; 0303 health sciences; business.industry; Visualization; ALIGNMENT; 1182 Biochemistry, cell and molecular biology; TREES; Web service; User interface; business; Software engineering; computer; 030217 neurology & neurosurgery
- Abstract
Wasabi is an open-source, web-based graphical environment for evolutionary sequence analysis and visualization, designed to work with multiple sequence alignments within their phylogenetic context. Its interactive user interface provides convenient access to external data sources and computational tools and is easily extendable with custom tools and pipelines using a plugin system. Wasabi stores intermediate editing and analysis steps as workflow histories and provides direct-access web links to datasets, allowing for reproducible, collaborative research, and easy dissemination of the results. In addition to shared analyses and installation-free usage, the web-based design allows Wasabi to be run as a cross-platform, stand-alone application and makes its integration to other web services straightforward. This chapter gives a detailed description and guidelines for the use of Wasabi's analysis environment. Example use cases will give step-by-step instructions for practical application of the public Wasabi, from quick data visualization to branched analysis pipelines and publishing of results. We end with a brief discussion of advanced usage of Wasabi, including command-line communication, interface extension, offline usage, and integration to local and public web services. The public Wasabi application, its source code, documentation, and other materials are available at http://wasabiapp.org.
- Published
- 2021
21. Phylogeny-Aware Alignment with PRANK and PAGAN
- Author
-
Ari Löytynoja, Katoh, Kazutaka, Ari Pekka Löytynoja / Principal Investigator, Bioinformatics, and Institute of Biotechnology
- Subjects
0106 biological sciences ,0303 health sciences ,Evolutionary sequence analysis ,Insertions and deletions ,Phylogenetic tree ,Computer science ,business.industry ,ERRORS ,computer.software_genre ,Character homology ,010603 evolutionary biology ,01 natural sciences ,MULTIPLE SEQUENCE ALIGNMENT ,03 medical and health sciences ,Phylogenetics ,DISTANCE ,1182 Biochemistry, cell and molecular biology ,TREES ,Artificial intelligence ,Phylogeny-aware alignment ,business ,computer ,Natural language processing ,030304 developmental biology - Abstract
Evolutionary analyses require sequence alignments that correctly represent evolutionary homology. Evolutionary homology and proteins' structural similarity are not the same and sequence alignments generated with methods designed for structural matching can be seriously misleading in comparative and phylogenetic analyses. The phylogeny-aware alignment algorithm implemented in the program PRANK has been shown to produce good alignments for evolutionary inferences. Unlike other alignment programs, PRANK makes use of phylogenetic information to distinguish alignment gaps caused by insertions or deletions and, thereafter, handles the two types of events differently. As a by-product of the correct handling of insertions and deletions, PRANK can provide the inferred ancestral sequences as a part of the output and mark the alignment gaps differently depending on their origin in insertion or deletion events. As the algorithm infers the evolutionary history of the sequences, PRANK can be sensitive to errors in the guide phylogeny and violations on the underlying assumptions about the origin and patterns of gaps. To mitigate the effects of such model violations, the phylogeny-aware alignment algorithm has been re-implemented in program PAGAN. By using sequence graphs, PAGAN can model and accumulate evidence from more complex gap structures than PRANK does, and incorporate this uncertainty in the inferred ancestral sequences. These issues are discussed in detail below and practical advice is provided for the use of PRANK and PAGAN in evolutionary analysis. The two software packages can be downloaded from http://wasabiapp.org/software.
- Published
- 2020
22. Evolutionary Sequence Analysis and Visualization with Wasabi
- Author
-
Andres Veidenberg and Ari Löytynoja
- Subjects
Evolution, Molecular ,Internet ,User-Computer Interface ,Computational Biology ,Sequence Analysis, DNA ,Sequence Alignment ,Sequence Analysis ,Algorithms ,Phylogeny ,Software ,Workflow - Abstract
Wasabi is an open-source, web-based graphical environment for evolutionary sequence analysis and visualization, designed to work with multiple sequence alignments within their phylogenetic context. Its interactive user interface provides convenient access to external data sources and computational tools and is easily extendable with custom tools and pipelines using a plugin system. Wasabi stores intermediate editing and analysis steps as workflow histories and provides direct-access web links to datasets, allowing for reproducible, collaborative research, and easy dissemination of the results. In addition to shared analyses and installation-free usage, the web-based design allows Wasabi to be run as a cross-platform, stand-alone application and makes its integration to other web services straightforward. This chapter gives a detailed description and guidelines for the use of Wasabi's analysis environment. Example use cases will give step-by-step instructions for practical application of the public Wasabi, from quick data visualization to branched analysis pipelines and publishing of results. We end with a brief discussion of advanced usage of Wasabi, including command-line communication, interface extension, offline usage, and integration to local and public web services. The public Wasabi application, its source code, documentation, and other materials are available at http://wasabiapp.org.
- Published
- 2020
23. Phylogeny-Aware Alignment with PRANK and PAGAN
- Author
-
Ari Löytynoja
- Subjects
Evolution, Molecular ,Mutagenesis, Insertional ,Base Sequence ,Reproducibility of Results ,Sequence Analysis, DNA ,Sequence Alignment ,Algorithms ,Phylogeny ,Software ,Sequence Deletion - Abstract
Evolutionary analyses require sequence alignments that correctly represent evolutionary homology. Evolutionary homology and proteins' structural similarity are not the same and sequence alignments generated with methods designed for structural matching can be seriously misleading in comparative and phylogenetic analyses. The phylogeny-aware alignment algorithm implemented in the program PRANK has been shown to produce good alignments for evolutionary inferences. Unlike other alignment programs, PRANK makes use of phylogenetic information to distinguish alignment gaps caused by insertions or deletions and, thereafter, handles the two types of events differently. As a by-product of the correct handling of insertions and deletions, PRANK can provide the inferred ancestral sequences as a part of the output and mark the alignment gaps differently depending on their origin in insertion or deletion events. As the algorithm infers the evolutionary history of the sequences, PRANK can be sensitive to errors in the guide phylogeny and violations on the underlying assumptions about the origin and patterns of gaps. To mitigate the effects of such model violations, the phylogeny-aware alignment algorithm has been re-implemented in program PAGAN. By using sequence graphs, PAGAN can model and accumulate evidence from more complex gap structures than PRANK does, and incorporate this uncertainty in the inferred ancestral sequences. These issues are discussed in detail below and practical advice is provided for the use of PRANK and PAGAN in evolutionary analysis. The two software packages can be downloaded from http://wasabiapp.org/software.
- Published
- 2020
24. Genetic population structure constrains local adaptation in sticklebacks
- Author
-
Baocheng Guo, Pasi Rastas, Juha Merilä, Takahito Shikano, Ari Löytynoja, Zitong Li, Bohao Fang, Petri Kemppainen, Jing Yang, Ecological Genetics Research Unit, Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, Institute of Biotechnology, Bioinformatics, and Ari Pekka Löytynoja / Principal Investigator
- Subjects
0106 biological sciences ,0301 basic medicine ,epistasis ,Pungitius pungitius ,Genetic Linkage ,Population ,Quantitative trait locus ,Pitx1 ,010603 evolutionary biology ,01 natural sciences ,pelvic reduction ,03 medical and health sciences ,Genetic variation ,Genetics ,Animals ,14. Life underwater ,convergent evolution ,education ,Ecology, Evolution, Behavior and Systematics ,Local adaptation ,education.field_of_study ,Genome ,biology ,Stickleback ,Chromosome Mapping ,biology.organism_classification ,Genetic architecture ,Smegmamorpha ,030104 developmental biology ,Genetics, Population ,Evolutionary biology ,1181 Ecology, evolutionary biology ,Epistasis ,Genetic isolate ,local adaptation - Abstract
Repeated and independent adaptation to specific environmental conditions from standing genetic variation is common. However, if genetic variation is limited, the evolution of similar locally adapted traits may be restricted to genetically different and potentially less optimal solutions or prevented from happening altogether. Using a quantitative trait locus (QTL) mapping approach, we identified the genomic regions responsible for the repeated pelvic reduction (PR) in three crosses between nine-spined stickleback populations expressing full and reduced pelvic structures. In one cross, PR mapped to linkage group 7 (LG7) containing the gene Pitx1, known to control pelvic reduction also in the three-spined stickleback. In the two other crosses, PR was polygenic and attributed to 10 novel QTL, of which 90% were unique to specific crosses. When screening the genomes from 27 different populations for deletions in the Pitx1 regulatory element, these were only found in the population in which PR mapped to LG7, even though the morphological data indicated large-effect QTL for PR in several other populations as well. Consistent with the available theory and simulations parameterized on empirical data, we hypothesize that the observed variability in genetic architecture of PR is due to heterogeneity in the spatial distribution of standing genetic variation caused by >2x stronger population structuring among freshwater populations and >10x stronger genetic isolation by distance in the sea in nine-spined sticklebacks as compared to three-spined sticklebacks.
- Published
- 2020
25. Pline: automatic generation of modern web interfaces for command-line programs
- Author
-
Ari Löytynoja and Andres Veidenberg
- Subjects
FOS: Computer and information sciences ,Web standards ,business.industry ,Computer science ,computer.software_genre ,Quantitative Biology - Quantitative Methods ,Pipeline (software) ,Software Engineering (cs.SE) ,Computer Science - Software Engineering ,Interactivity ,FOS: Biological sciences ,Web page ,Web application ,Plug-in ,User interface ,business ,Software engineering ,computer ,Quantitative Methods (q-bio.QM) ,Graphical user interface - Abstract
Background: Bioinformatics software often lacks graphical user interfaces (GUIs), which can limit its adoption by non-technical members of the scientific community. Web interfaces are a common alternative for building cross-platform GUIs, but their potential is underutilized: web interfaces for command-line tools rarely take advantage of the level of interactivity expected of modern web applications and are rarely usable offline. Results: Here we present Pline: a lightweight framework that uses program descriptions and web standards to generate dynamic GUIs for command-line programs. We introduce a plugin system for creating Pline interfaces and provide an online repository for sharing third-party plugins. We demonstrate Pline’s versatility with example interfaces, a graphical pipeline for sequence analysis and integration to Wasabi web application. Conclusions: Pline is cross-platform, open-source software that can be integrated to web pages or used as a standalone desktop application. Pline provides graphical interfaces that are easy to create and maintain, fostering user-friendly software in science. Documentation, demo website, example plugins and source code are freely available from http://wasabiapp.org/pline. Keywords: Bioinformatics; Software Engineering; User Interfaces; Web Technologies
- Published
- 2020
26. An inducible genome editing system for plants
- Author
-
Ari Pekka Mähönen, Xin Wang, Robertas Ursache, Munan Lyu, Lingling Ye, Ari Löytynoja, Helsinki Institute of Life Science HiLIFE, Institute of Biotechnology, Organismal and Evolutionary Biology Research Programme, Viikki Plant Science Centre (ViPS), and Ari Pekka Löytynoja / Principal Investigator
- Subjects
0106 biological sciences ,0301 basic medicine ,PROTEINS ,Plant Science ,Computational biology ,01 natural sciences ,Article ,Gene Knockout Techniques ,03 medical and health sciences ,Genome editing ,Gene Expression Regulation, Plant ,Arabidopsis ,Gene expression ,STEM-CELL MAINTENANCE ,Gene ,GENE-EXPRESSION ,Gene Editing ,Developmental stage ,biology ,DELETION ,Plants ,Plants, Genetically Modified ,biology.organism_classification ,ARABIDOPSIS ,11831 Plant biology ,Null allele ,030104 developmental biology ,DNA-DAMAGE ,VECTORS ,AUXIN TRANSPORT ,GNOM ,Genome, Plant ,Function (biology) ,010606 plant biology & botany ,GENERATION - Abstract
Conditional manipulation of gene expression is a key approach to investigating the primary function of a gene in a biological process. While conditional and cell-type-specific overexpression systems exist for plants, there are currently no systems available to disable a gene completely and conditionally. Here, we present a new tool with which target genes can efficiently and conditionally be knocked out by genome editing at any developmental stage. Target genes can also be knocked out in a cell-type-specific manner. Our tool is easy to construct and will be particularly useful for studying genes having null alleles that are non-viable or show pleiotropic developmental defects.
- Published
- 2020
27. We shall meet again - Genomics of historical admixture in the sea
- Author
-
Ari Löytynoja, Xueyun Feng, and Juha Merilä
- Subjects
0106 biological sciences ,0303 health sciences ,Phylogenetic tree ,Introgression ,Genomics ,Biology ,010603 evolutionary biology ,01 natural sciences ,Source Population ,03 medical and health sciences ,Hybrid zone ,Evolutionary biology ,14. Life underwater ,North sea ,Selection (genetic algorithm) ,030304 developmental biology ,Local adaptation - Abstract
We studied the impact of genetic introgression in evolution and on evolutionary studies with whole-genome data from two divergent lineages of sticklebacks. Our results reveal that the hybrid zone between the lineages ranges across the entire Baltic Sea and parts of the North Sea with the foreign ancestry decreasing with increasing distance to the source population. Introgression has also penetrated currently isolated freshwater populations. We identified footprints of selection on regions enriched for introgressed variants, suggesting that some of the introgression has been adaptive. However, the levels of introgression were in general negatively correlated with the recombination rate, suggesting that the introgression has been largely neutral and adaptive ancestral standing variation likely had a more important role in shaping the genomic landscape. Our results further suggest that overlooked introgression can mislead analyses of local adaptation and phylogenetic affinities, highlighting the importance of accounting for introgression in studies of local adaptation.
- Published
- 2020
28. Genetic population structure constrains local adaptation in sticklebacks
- Author
-
Petri Kemppainen, Juha Merilä, Pasi Rastas, Takahito Shikano, Baocheng Guo, Jing Yang, Zitong Li, Bohao Fang, and Ari Löytynoja
- Subjects
0106 biological sciences ,0303 health sciences ,education.field_of_study ,Population ,Stickleback ,Biology ,Quantitative trait locus ,biology.organism_classification ,010603 evolutionary biology ,01 natural sciences ,Genetic architecture ,03 medical and health sciences ,Evolutionary biology ,Genetic variation ,14. Life underwater ,Adaptation ,education ,Genetic isolate ,030304 developmental biology ,Local adaptation - Abstract
Repeated and independent adaptation to specific environmental conditions from standing genetic variation is common. However, if genetic variation is limited, the evolution of similar locally adapted traits may be restricted to genetically different and potentially less optimal solutions or prevented from happening altogether. Using a quantitative trait locus (QTL) mapping approach, we identified the genomic regions responsible for the repeated pelvic reduction (PR) in three crosses between nine-spined stickleback populations expressing full and reduced pelvic structures. In one cross, PR mapped to linkage group 7 (LG7) containing the gene Pitx1, known to control pelvic reduction also in the three-spined stickleback. In the two other crosses, PR was polygenic and attributed to ten novel QTL, of which 90% were unique to specific crosses. When screening the genomes from 27 different populations for deletions in the Pitx1 regulatory element, these were only found in the population in which PR mapped to LG7, even though the morphological data indicated large effect QTL for PR in several other populations as well. Consistent with the available theory and simulations parameterised on empirical data, we hypothesise that the observed variability in genetic architecture of PR is due to heterogeneity in the spatial distribution of standing genetic variation caused by >2x stronger population structuring among freshwater populations and >10x stronger genetic isolation by distance in the sea in nine-spined sticklebacks as compared to three-spined sticklebacks.
- Published
- 2020
29. A High-Quality Assembly of the Nine-Spined Stickleback (Pungitius pungitius) Genome
- Author
-
Baocheng Guo, Juha Merilä, Ari Löytynoja, Federico C. F. Calboli, Michael Matschiner, Alexander J. Nederbragt, Pasi Rastas, Kjetill S. Jakobsen, Srinidhi Varadharajan, Ecological Genetics Research Unit, Organismal and Evolutionary Biology Research Programme, Faculty of Biological and Environmental Sciences, Institute of Biotechnology, Bioinformatics, Ari Pekka Löytynoja / Principal Investigator, University of Zurich, and Venkatesh, B
- Subjects
Male ,0106 biological sciences ,TANDEM REPEATS ,comparative genomics ,10125 Paleontological Institute and Museum ,01 natural sciences ,Genome ,Hemoglobins ,Copy-number variation ,Phylogeny ,Recombination, Genetic ,0303 health sciences ,education.field_of_study ,Ecology ,biology ,stickleback ,1184 Genetics, developmental biology, physiology ,Stickleback ,560 Fossils & prehistoric life ,1181 Ecology, evolutionary biology ,Female ,Research Article ,Fish Proteins ,Pungitius pungitius ,Population ,MICROSATELLITES ,Gasterosteus ,010603 evolutionary biology ,SEQUENCE ,GENETIC ARCHITECTURE ,Evolution, Molecular ,03 medical and health sciences ,PELVIC REDUCTION ,Pungitius ,1311 Genetics ,Behavior and Systematics ,EUKARYOTIC GENOMES ,Genetics ,Animals ,14. Life underwater ,education ,Ecology, Evolution, Behavior and Systematics ,030304 developmental biology ,Synteny ,Comparative genomics ,DATING METHODS ,Molecular Sequence Annotation ,DNA ,biology.organism_classification ,EVOLUTION ,Perciformes ,1105 Ecology, Evolution, Behavior and Systematics ,Evolutionary biology ,DNA Transposable Elements ,genome assembly ,TRANSPOSABLE ELEMENTS ,Microsatellite Repeats - Abstract
The Gasterosteidae fish family hosts several species that are important models for eco-evolutionary, genetic and genomic research. In particular, a wealth of genetic and genomic data has been generated for the three-spined stickleback (Gasterosteus aculeatus), the ‘ecology’s supermodel’, while the genomic resources for the nine-spined stickleback (Pungitius pungitius) have remained relatively scarce. Here, we report a high-quality chromosome-level genome assembly of P. pungitius consisting of 5,303 contigs (N50 = 1.2 Mbp) with a total size of 521 Mbp. These contigs were mapped to 21 linkage groups using a high-density linkage map, yielding a final assembly with 98.5% BUSCO completeness. A total of 25,062 protein-coding genes were annotated, and ca. 23% of the assembly was found to consist of repetitive elements. A comprehensive analysis of repetitive elements uncovered centromeric-specific tandem repeats and provided insights into the evolution of retrotransposons. A multigene phylogenetic analysis inferred a divergence time of about 26 million years (MYA) between nine- and three-spined sticklebacks, which is far older than the commonly assumed estimate of 13 MYA. Compared to the three-spined stickleback, we identified an additional duplication of several genes in the hemoglobin cluster. Sequencing data from populations adapted to different environments indicated potential copy number variations in hemoglobin genes. Furthermore, genome-wide synteny comparisons between three- and nine-spined sticklebacks identified chromosomal rearrangements underlying the karyotypic differences between the two species. The high-quality chromosome-scale assembly of the nine-spined stickleback genome obtained with long-read sequencing technology provides a crucial resource for comparative and population genomic investigations of stickleback fishes and teleosts.
- Published
- 2019
30. Effects of marker type and filtering criteria on QST-FST comparisons
- Author
-
Juha Merilä, Ari Löytynoja, Zitong Li, Antoine Fraimout, Environmental and Ecological Statistics Group, Organismal and Evolutionary Biology Research Programme, Ari Pekka Löytynoja / Principal Investigator, Bioinformatics, Institute of Biotechnology, Ecology and Evolutionary Biology, and Ecological Genetics Research Unit
- Subjects
0106 biological sciences ,microsatellite ,quantitative genetics ,Population ,qst-fst ,Biology ,010603 evolutionary biology ,01 natural sciences ,pungitius pungitius ,03 medical and health sciences ,Type (biology) ,lcsh:Science ,education ,030304 developmental biology ,0303 health sciences ,education.field_of_study ,Multidisciplinary ,Natural selection ,natural selection ,Quantitative genetics ,Genetic differentiation ,Evolutionary biology ,1181 Ecology, evolutionary biology ,Microsatellite ,lcsh:Q ,single-locus polymorphisms ,human activities - Abstract
Comparative studies of quantitative and neutral genetic differentiation (QST-FST tests) provide means to detect adaptive population differentiation. However, QST-FST tests can be overly liberal if the markers used deflate FST below its expectation, or overly conservative if methodological biases lead to inflated FST estimates. We investigated how marker type and filtering criteria for marker selection influence QST-FST comparisons through their effects on FST using simulations and empirical data on over 18 000 in silico genotyped microsatellites and 3.8 million single-locus polymorphism (SNP) loci from four populations of nine-spined sticklebacks (Pungitius pungitius). Empirical and simulated data revealed that FST decreased with increasing marker variability, and was generally higher with SNPs than with microsatellites. The estimated baseline FST levels were also sensitive to filtering criteria for SNPs: both minor alleles and linkage disequilibrium (LD) pruning influenced FST estimation, as did marker ascertainment. However, in the case of stickleback data used here where QST is high, the choice of marker type, their genomic location, ascertainment and filtering made little difference to outcomes of QST-FST tests. Nevertheless, we recommend that QST-FST tests using microsatellites should discard the most variable loci, and those using SNPs should pay attention to marker ascertainment and properly account for LD before filtering SNPs. This may be especially important when level of quantitative trait differentiation is low and levels of neutral differentiation high.
- Published
- 2019
31. Metabarcoding Gastrointestinal Nematodes in Sympatric Endemic and Nonendemic Species in Ranomafana National Park, Madagascar
- Author
-
Juha Laakkonen, Ari Löytynoja, Alan Medlar, Tuomas Aivelo, Jukka Jernvall, Institute of Biotechnology, Biosciences, Ari Pekka Löytynoja / Principal Investigator, Bioinformatics, Departments of Faculty of Veterinary Medicine, Veterinary Biosciences, Veterinary Anatomy and Developmental Biology, Jukka Jernvall / Principal Investigator, Helsinki Institute of Sustainability Science (HELSUS), Computational genomics, Global Change and Conservation Lab, and Teachers' Academy
- Subjects
0106 biological sciences ,0301 basic medicine ,Lemurs ,PARASITES ,TRANSMISSION ,DIVERSITY ,Biodiversity ,Zoology ,Lemur ,DNA BARCODE ,Biology ,010603 evolutionary biology ,01 natural sciences ,DNA barcoding ,Invasive species ,03 medical and health sciences ,biology.animal ,Parasite hosting ,Ecology, Evolution, Behavior and Systematics ,Noninvasive sampling ,Host (biology) ,030104 developmental biology ,RAIN-FORESTS ,Sympatric speciation ,Animal ecology ,1181 Ecology, evolutionary biology ,Metabarcoding ,POPULATIONS ,BIODIVERSITY ,Animal Science and Zoology ,ALOUATTA-PALLIATA-MEXICANA ,NONINVASIVE ASSESSMENT ,COMMUNITIES - Abstract
Sympatric species are known to host the same parasite species. Nevertheless, surveys examining parasite assemblages in sympatric species are rare. To understand how parasite assemblages in sympatric host species differ in a given locality, we used a noninvasive identification method based on high-throughput sequencing. We collected fecal samples from sympatric species in Ranomafana National Park, Madagascar, from September to December in 2010, 2011, and 2012 and identified their parasites by metabarcoding, sequencing a region of the small ribosomal subunit (18S) gene. Our survey included 11 host species, including endemic primates, rodents, frogs, gastropods, and nonendemic rats and dogs. We collected 872 samples, of which 571 contained nematodes and 249 were successfully sequenced. We identified nine putative species of parasites, although their correspondence to actual parasite species is not clear as the resolution of the marker gene differs between nematode clades. For the host species that we successfully sampled with 10 or more positive occurrences of nematodes, i.e., mouse lemurs (Microcebus rufus), black rats (Rattus rattus), and frogs (Anura), the parasite assemblage compositions differed significantly among host species, sampling sites, and sampling years. Our metabarcoding method shows promise in interrogating parasite assemblages in sympatric host species and our results emphasize the importance of choosing marker regions for parasite identification accuracy.
- Published
- 2018
32. MATLIGN: a motif clustering, comparison and matching tool.
- Author
-
Matti Kankainen and Ari Löytynoja
- Published
- 2007
33. Short template switch events explain mutation clusters in the human genome
- Author
-
Ari Löytynoja and Nick Goldman
- Subjects
0301 basic medicine ,Pan troglodytes ,Computer science ,Inverted Repeat Sequences ,Inverted repeat ,Method ,Genomics ,Computational biology ,Biology ,medicine.disease_cause ,Polymorphism, Single Nucleotide ,Genome ,DNA sequencing ,Frameshift mutation ,03 medical and health sciences ,0302 clinical medicine ,INDEL Mutation ,Genetic variation ,medicine ,Genetics ,Animals ,Humans ,Genetics (clinical) ,030304 developmental biology ,Sequence (medicine) ,0303 health sciences ,Mutation ,Base Sequence ,Models, Genetic ,Mechanism (biology) ,Genome, Human ,Positive selection ,High-Throughput Nucleotide Sequencing ,Biological Evolution ,030104 developmental biology ,Mutation (genetic algorithm) ,Human genome ,Sequence Alignment ,030217 neurology & neurosurgery - Abstract
Resequencing efforts are uncovering the extent of genetic variation in humans and provide data to study the evolutionary processes shaping our genome. One recurring puzzle in both intra- and inter-species studies is the high frequency of complex mutations comprising multiple nearby base substitutions or insertion-deletions. We devised a generalized mutation model of template switching during replication that extends existing models of genome rearrangement, and used this to study the role of template switch events in the origin of such mutation clusters. Applied to the human genome, our model detects thousands of template switch events during the evolution of human and chimp from their common ancestor, and hundreds of events between two independently sequenced human genomes. While many of these are consistent with the template switch mechanism previously proposed for bacteria but not thought significant in higher organisms, our model also identifies new types of mutations that create short inversions, some flanked by paired inverted repeats. The local template switch process can create numerous complex mutation patterns, including hairpin loop structures, and explains multi-nucleotide mutations and compensatory substitutions without invoking positive selection, complicated and speculative mechanisms, or implausible coincidence. Clustered sequence differences are challenging for mapping and variant calling methods, and we show that detection of mutation clusters with current resequencing methodologies is difficult and many erroneous variant annotations exist in human reference data. Template switch events such as those we have uncovered may have been neglected as an explanation for complex mutations because of biases in commonly used analyses. Incorporation of our model into reference-based analysis pipelines and comparisons of de novo-assembled genomes will lead to improved understanding of genome variation and evolution.
- Published
- 2017
34. Mechanistic insights into the evolution of DUF26-containing proteins in land plants
- Author
-
Andres Veidenberg, Omid Safronov, Michael Hothorn, Jaakko Kangasjärvi, Jarkko Salojärvi, Markéta Luklová, Ari Löytynoja, Michael Wrzaczek, Sitaram Rajaraman, Aleksia Vaattovaara, Benjamin Brandt, School of Biological Sciences, Organismal and Evolutionary Biology Research Programme, Viikki Plant Science Centre (ViPS), Plant-Fungal Interactions Group, Bioinformatics for Molecular Biology and Genomics (BMBG), Institute of Biotechnology, Plant Biology, Plant ROS-Signalling, Bioinformatics, Ari Pekka Löytynoja / Principal Investigator, and Receptor-Ligand Signaling Group
- Subjects
0106 biological sciences ,Plant Evolution ,Protein family ,Gene Dosage ,Medicine (miscellaneous) ,Computational biology ,Biology ,01 natural sciences ,Gene dosage ,Genome ,Molecular Evolution ,Article ,General Biochemistry, Genetics and Molecular Biology ,Evolution, Molecular ,03 medical and health sciences ,Phylogenetics ,Gene Expression Regulation, Plant ,Gene Duplication ,Gene duplication ,Gene family ,lcsh:QH301-705.5 ,Gene ,Phylogeny ,030304 developmental biology ,Plant Proteins ,0303 health sciences ,Genetic Drift ,Intracellular Signaling Peptides and Proteins ,Biological sciences [Science] ,Molecular Sequence Annotation ,15. Life on land ,DNA-Binding Proteins ,ddc:580 ,Gene Ontology ,lcsh:Biology (General) ,1181 Ecology, evolutionary biology ,Embryophyta ,Domain of unknown function ,General Agricultural and Biological Sciences ,Protein Kinases ,Genome, Plant ,010606 plant biology & botany - Abstract
Large protein families are a prominent feature of plant genomes and their size variation is a key element for adaptation. However, gene and genome duplications pose difficulties for functional characterization and translational research. Here we infer the evolutionary history of the DOMAIN OF UNKNOWN FUNCTION (DUF) 26-containing proteins. The DUF26 emerged in secreted proteins. Domain duplications and rearrangements led to the appearance of CYSTEINE-RICH RECEPTOR-LIKE PROTEIN KINASES (CRKs) and PLASMODESMATA-LOCALIZED PROTEINS (PDLPs). The DUF26 is land plant-specific but structural analyses of PDLP ectodomains revealed strong similarity to fungal lectins and thus may constitute a group of plant carbohydrate-binding proteins. CRKs expanded through tandem duplications and preferential retention of duplicates following whole genome duplications, whereas PDLPs evolved according to the dosage balance hypothesis. We propose that new gene families mainly expand through small-scale duplications, while fractionation and genetic drift after whole genome multiplications drive families towards dosage balance. Aleksia Vaattovaara et al. investigate the evolutionary history of a representative protein family, the DUF26-containing proteins, which is specific to land plants. They suggest that domain duplications and rearrangement led to the protein family’s two main subclasses.
- Published
- 2018
35. Bracketing phenogenotypic limits of mammalian hybridization
- Author
-
Juhana Kammonen, Lars Paulin, Mia Valtonen, Sylvain Gerber, Liisa Holm, Teemu J. Häkkinen, Pasi Rastas, Annina Lyyski, Petri Auvinen, Olli-Pekka Smolander, Yoland Savriama, Ian J. Corfe, Ari Löytynoja, Jukka Jernvall, Isaac Salazar-Ciudad, Institute of Biotechnology, Organismal and Evolutionary Biology Research Programme, Computational genomics, Bioinformatics, Faculty of Biological and Environmental Sciences, Ari Pekka Löytynoja / Principal Investigator, DNA Sequencing and Genomics, and Jukka Jernvall / Principal Investigator
- Subjects
0106 biological sciences ,0301 basic medicine ,Morphology ,SUPERNUMERARY TEETH ,Introgression ,dental ,introgression ,Biology ,010603 evolutionary biology ,01 natural sciences ,SEQUENCE ,Gene flow ,03 medical and health sciences ,Developmental conservation ,EVOLUTIONARY HISTORY ,Genetic algorithm ,morphology ,GENE FLOW ,lcsh:Science ,Bracketing ,SPECIATION ,developmental conservation ,Multidisciplinary ,Disparity ,Biology (Whole Organism) ,species hybridization ,Species hybridization ,INDIVIDUALS ,030104 developmental biology ,disparity ,Evolutionary biology ,DENTITION ,1181 Ecology, evolutionary biology ,Dental ,lcsh:Q ,SEAL ,Research Article ,GENERATION - Abstract
An increasing number of mammalian species have been shown to have a history of hybridization and introgression based on genetic analyses. Only relatively few fossils, however, preserve genetic material, and morphology must be used to identify the species and determine whether morphologically intermediate fossils could represent hybrids. Because dental and cranial fossils are typically the key body parts studied in mammalian palaeontology, here we bracket the potential for phenotypically extreme hybridizations by examining uniquely preserved cranio-dental material of a captive hybrid between grey and ringed seals. We analysed how distinct these species are genetically and morphologically, how easy it is to identify the hybrids using morphology and whether comparable hybridizations happen in the wild. We show that the genetic distance between these species is more than twice the modern human–Neanderthal distance, but still within that of morphologically similar species pairs known to hybridize. By contrast, morphological and developmental analyses show grey and ringed seals to be highly disparate, and that the hybrid is a predictable intermediate. Genetic analyses of the parent populations reveal introgression in the wild, suggesting that grey–ringed seal hybridization is not limited to captivity. Taken together, we postulate that there is considerable potential for mammalian hybridization between phenotypically disparate taxa.
- Published
- 2018
36. SOAP, cleaning multiple alignments from unstable blocks.
- Author
-
Ari Löytynoja and Michel C. Milinkovitch
- Published
- 2001
- Full Text
- View/download PDF
37. webPRANK: a phylogeny-aware multiple sequence aligner with interactive alignment browser.
- Author
-
Ari Löytynoja and Nick Goldman
- Published
- 2010
- Full Text
- View/download PDF
38. Wasabi: An Integrated Platform for Evolutionary Sequence Analysis and Data Visualization
- Author
-
Andres Veidenberg, Alan Medlar, and Ari Löytynoja
- Subjects
0301 basic medicine ,Source code ,Interface (Java) ,media_common.quotation_subject ,Context (language use) ,Web Browser ,Biology ,Bioinformatics ,Evolution, Molecular ,03 medical and health sciences ,Data visualization ,Genetics ,Ensembl ,Molecular Biology ,Phylogeny ,Ecology, Evolution, Behavior and Systematics ,media_common ,Comparative genomics ,Internet ,Genome ,Information retrieval ,business.industry ,Sequence Analysis, DNA ,Tree (data structure) ,030104 developmental biology ,Workflow ,business ,Sequence Alignment ,Sequence Analysis ,Algorithms ,Software - Abstract
Wasabi is an open source, web-based environment for evolutionary sequence analysis. Wasabi visualizes sequence data together with a phylogenetic tree within a modern, user-friendly interface: The interface hides extraneous options, supports context sensitive menus, drag-and-drop editing, and displays additional information, such as ancestral sequences, associated with specific tree nodes. The Wasabi environment supports reproducibility by automatically storing intermediate analysis steps and includes built-in functions to share data between users and publish analysis results. For computational analysis, Wasabi supports PRANK and PAGAN for phylogeny-aware alignment and alignment extension, and it can be easily extended with other tools. Along with drag-and-drop import of local files, Wasabi can access remote data through URL and import sequence data, GeneTrees and EPO alignments directly from Ensembl. To demonstrate a typical workflow using Wasabi, we reproduce key findings from recent comparative genomics studies, including a reanalysis of the EGLN1 gene from the tiger genome study: These case studies can be browsed within Wasabi at http://wasabiapp.org:8000?id=usecases. Wasabi runs inside a web browser and does not require any installation. One can start using it at http://wasabiapp.org. All source code is licensed under the AGPLv3.
- Published
- 2015
39. Bracketing phenotypic limits of mammalian hybridization
- Author
-
Olli-Pekka Smolander, Ian J. Corfe, Mia Valtonen, Jukka Jernvall, Yoland Savriama, Sylvain Gerber, Pasi Rastas, Liisa Holm, Lars Paulin, Ari Löytynoja, Annina Lyyski, Petri Auvinen, Teemu J. Häkkinen, Isaac Salazar-Ciudad, and Juhana Kammonen
- Subjects
0106 biological sciences ,0303 health sciences ,Morphological similarity ,Introgression ,Captivity ,Biology ,010603 evolutionary biology ,01 natural sciences ,Phenotype ,03 medical and health sciences ,Taxon ,Genetic distance ,Evolutionary biology ,Adaptive radiation ,030304 developmental biology ,Hybrid - Abstract
An increasing number of mammalian species have been shown to have a history of hybridization and introgression based on genetic analyses. Only relatively few fossils, however, preserve genetic material and morphology must be used to identify the species and determine whether morphologically intermediate fossils could represent hybrids. Because dental and cranial fossils are typically the key body parts studied in mammalian paleontology, here we bracket the potential for phenotypically extreme hybridizations by examining uniquely preserved cranio-dental material of a captive hybrid between gray and ringed seals. We analyzed how distinct these species are genetically and morphologically, how easy it is to identify the hybrids using morphology, and whether comparable hybridizations happen in the wild. We show that the genetic distance between these species is more than twice the modern human-Neanderthal distance, but still within that of morphologically similar species-pairs known to hybridize. In contrast, morphological and developmental analyses show gray and ringed seals to be highly disparate, and that the hybrid is a predictable intermediate. Genetic analyses of the parent populations reveal introgression in the wild, suggesting that gray-ringed seal hybridization is not limited to captivity. Taken together, gray and ringed seals appear to be in an adaptive radiation phase of evolution, showing large morphological differences relative to their comparatively modest genetic distance. Because morphological similarity does not always correlate with genetic distance in nature, we postulate that there is considerable potential for mammalian hybridization between phenotypically disparate taxa.
- Published
- 2018
- Full Text
- View/download PDF
40. Cleavage of the Drosophila screw prodomain is critical for a dynamic BMP morphogen gradient in embryogenesis
- Author
-
Petra M. Tauscher, Minh Nguyen, Jaana Künnapuu, Ari Löytynoja, Kavita Arora, Osamu Shimmi, and Nina Tiusanen
- Subjects
Post-translational regulation ,animal structures ,Embryo, Nonmammalian ,Molecular Sequence Data ,Embryonic Development ,Dorsal–ventral patterning ,Cleavage (embryo) ,Bone morphogenetic protein ,Ligands ,Bone morphogenetic protein 2 ,03 medical and health sciences ,0302 clinical medicine ,Transforming Growth Factor beta ,Animals ,Drosophila Proteins ,Blastoderm ,Amino Acid Sequence ,Enhancer ,Furin ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,biology ,Decapentaplegic ,Cell Biology ,Proprotein convertase ,Decapentaplegic (dpp) ,Cell biology ,Protein Structure, Tertiary ,Drosophila melanogaster ,Biochemistry ,Bone Morphogenetic Proteins ,Mutation ,biology.protein ,Mutant Proteins ,Protein Multimerization ,030217 neurology & neurosurgery ,Morphogen ,Signal Transduction ,Developmental Biology - Abstract
Dorsoventral patterning of the Drosophila embryo is regulated by graded distribution of bone morphogenetic proteins (BMPs) composed of two ligands: decapentaplegic (Dpp), a BMP2/4 ortholog, and screw (Scw), a BMP5/6/7/8 family member. scwE1 encodes an unusual allele that was isolated as a dominant enhancer of partial loss-of-function mutations in dpp. However, the molecular mechanisms that underlie this genetic interaction remain to be addressed. Here we show that scwE1 contains a mutation at the furin cleavage site within the prodomain that is crucial for ligand production. Furthermore, our data show that ScwE1 preferentially forms heterodimers with Dpp rather than homotypic dimers, providing a possible explanation for the dominant negative phenotype of scwE1 alleles. The unprocessed prodomain of ScwE1 remains in a complex with the Dpp:Scw heterodimer, and thus could interfere with interaction of the ligand with the extracellular matrix, or the kinetics of processing/secretion of the ligand in vivo. These data reveal novel mechanisms by which post-translational regulation of Scw can modulate Dpp signaling activity.
- Published
- 2014
- Full Text
- View/download PDF
41. Glutton: large-scale integration of non-model organism transcriptome data for comparative analysis
- Author
-
Andreia Miraldo, Laura M. Laakso, Ari Löytynoja, and Alan Medlar
- Subjects
0106 biological sciences ,Genetics ,0303 health sciences ,Correctness ,Contig ,Heuristic ,Computational biology ,Biology ,010603 evolutionary biology ,01 natural sciences ,Set (abstract data type) ,Transcriptome ,03 medical and health sciences ,Reference data ,Gene ,030304 developmental biology ,Reference genome - Abstract
High-throughput RNA-seq data has become ubiquitous in the study of non-model organisms, but its use in comparative analysis remains a challenge. Without a reference genome for mapping, sequence data has to be de novo assembled, producing large numbers of short, highly redundant contigs. Preparing these assemblies for comparative analyses requires the removal of redundant isoforms, assignment of orthologs and converting fragmented transcripts into gene alignments. In this article we present Glutton, a novel tool to process transcriptome assemblies for downstream evolutionary analyses. Glutton takes as input a set of fragmented, possibly erroneous transcriptome assemblies. Utilising phylogeny-aware alignment and reference data from a closely related species, it reconstructs one transcript per gene, finds orthologous sequences and produces accurate multiple alignments of coding sequences. We present a comprehensive analysis of Glutton’s performance across a wide range of divergence times between study and reference species. We demonstrate the impact choice of assembler has on both the number of alignments and the correctness of ortholog assignment and show substantial improvements over heuristic methods, without sacrificing correctness. Finally, using inference of Darwinian selection as an example of downstream analysis, we show that Glutton-processed RNA-seq data give results comparable to those obtained from full length gene sequences even with distantly related reference species. Glutton is available from http://wasabiapp.org/software/glutton/ and is licensed under the GPLv3.
- Published
- 2016
- Full Text
- View/download PDF
42. Co-estimation of Phylogeny-aware Alignment and Phylogenetic Tree
- Author
-
Chunmei Li, Ari Löytynoja, and Alan Medlar
- Subjects
0106 biological sciences ,Alternative methods ,0303 health sciences ,Theoretical computer science ,Phylogenetic tree ,Sequence analysis ,business.industry ,Biology ,010603 evolutionary biology ,01 natural sciences ,Rendering (computer graphics) ,03 medical and health sciences ,Software ,Phylogenetics ,Iterative search ,business ,Algorithm ,Alignment-free sequence analysis ,030304 developmental biology - Abstract
The phylogeny-aware alignment algorithm implemented in both PRANK and PAGAN has been found to produce highly accurate alignments for comparative sequence analysis. However, the algorithm’s reliance on a guide tree during the alignment process can bias the resulting alignment rendering it unsuitable for phylogenetic inference. To overcome these issues, we have developed a new tool, Canopy, for parallelized iterative search of optimal alignment. Using Canopy, we studied the impact of the guide tree as well as the number and relative divergence of sequences on the accuracy of the alignment and inferred phylogeny. We find that PAGAN is the more robust of the two phylogeny-aware alignment methods to errors in the guide tree, but Canopy largely resolves the guide tree-related biases in PRANK. We demonstrate that, for all experimental settings tested, Canopy produces the most accurate sequence alignments and, further, that the inferred phylogenetic trees are of comparable accuracy to those obtained with the leading alternative method, SATé. Our analyses also show that, unlike traditional alignment algorithms, the phylogeny-aware algorithm effectively uses the information from denser sequence sampling and produces more accurate alignments when additional closely-related sequences are included. All methods are available for download at http://wasabiapp.org/software.
- Published
- 2016
43. Genome content of uncultivated marine Roseobacters in the surface ocean
- Author
-
Mary Ann Moran, Haiwei Luo, and Ari Löytynoja
- Subjects
Genetics ,Metagenomics ,Horizontal gene transfer ,Bacterioplankton ,Biology ,Roseobacter ,Clade ,biology.organism_classification ,Microbiology ,Gene ,Genome ,Ecology, Evolution, Behavior and Systematics ,Nucleotide diversity - Abstract
Understanding of the ecological roles and evolutionary histories of marine bacterial taxa can be complicated by mismatches in genome content between wild populations and their better-studied cultured relatives. We used computed patterns of non-synonymous (amino acid-altering) nucleotide diversity in marine metagenomic data to provide high-confidence identification of DNA fragments from uncultivated members of the Roseobacter clade, an abundant taxon of heterotrophic marine bacterioplankton in the world's oceans. Differences in gene stoichiometry in the Global Ocean Survey metagenomic data set compared with 39 sequenced isolates indicated that natural Roseobacter populations differ systematically in several genomic attributes from their cultured representatives, including fewer genes for signal transduction and cell surface modifications but more genes for Sec-like protein secretion systems, anaplerotic CO2 incorporation, and phosphorus and sulfate uptake. Several of these trends match well with characteristics previously identified as distinguishing r- versus K-selected ecological strategies in bacteria, suggesting that the r-strategist model assigned to cultured roseobacters may be less applicable to their free-living oceanic counterparts. The metagenomic Roseobacter DNA fragments revealed several traits with evolutionary histories suggestive of horizontal gene transfer from other marine bacterioplankton taxa or viruses, including pyrophosphatases and glycosylation proteins.
- Published
- 2011
44. A model of evolution and structure for multiple sequence alignment
- Author
-
Ari Löytynoja and Nick Goldman
- Subjects
Genetics ,Multiple sequence alignment ,Models, Genetic ,Sequence analysis ,insertion–deletion processes ,Structural alignment ,evolutionary process heterogeneity ,Probabilistic logic ,Sequence alignment ,Sequence Analysis, DNA ,Computational biology ,Biology ,Process substitution ,Sensitivity and Specificity ,General Biochemistry, Genetics and Molecular Biology ,Evolution, Molecular ,character homology ,Phylogenetics ,sequence alignment ,Computer Simulation ,General Agricultural and Biological Sciences ,Algorithms ,Phylogeny ,Alignment-free sequence analysis ,Research Article - Abstract
We have developed a phylogeny-aware progressive alignment method that recognizes insertions and deletions as distinct evolutionary events and thus avoids systematic errors created by traditional alignment methods. We now extend this method to simultaneously model regional heterogeneity and evolution. This novel method can be flexibly adapted to alignment of nucleotide or amino acid sequences evolving under processes that vary over genomic regions and, being fully probabilistic, provides an estimate of regional heterogeneity of the evolutionary process along the alignment and a measure of local reliability of the solution. Furthermore, the evolutionary modelling of substitution process permits adjusting the sensitivity and specificity of the alignment and, if high specificity is aimed at, leaving sequences unaligned when their divergence is beyond a meaningful detection of homology.
- Published
- 2008
45. Phylogeny-Aware Gap Placement Prevents Errors in Sequence Alignment and Evolutionary Analysis
- Author
-
Ari Löytynoja and Nick Goldman
- Subjects
Genetics ,Membrane Glycoproteins ,Multidisciplinary ,Multiple sequence alignment ,Phylogenetic tree ,HIV ,Sequence alignment ,Gap penalty ,Computational biology ,HIV Envelope Protein gp120 ,Biology ,Evolution, Molecular ,Mutagenesis, Insertional ,Viral Envelope Proteins ,Molecular evolution ,Phylogenetics ,HIV-1 ,Computer Simulation ,Simian Immunodeficiency Virus ,Sequence Alignment ,Algorithms ,Phylogeny ,Alignment-free sequence analysis ,Sequence Deletion ,Sequence (medicine) - Abstract
Genetic sequence alignment is the basis of many evolutionary and comparative studies, and errors in alignments lead to errors in the interpretation of evolutionary information in genomes. Traditional multiple sequence alignment methods disregard the phylogenetic implications of gap patterns that they create and infer systematically biased alignments with excess deletions and substitutions, too few insertions, and implausible insertion-deletion–event histories. We present a method that prevents these systematic errors by recognizing insertions and deletions as distinct evolutionary events. We show theoretically and practically that this improves the quality of sequence alignments and downstream analyses over a wide range of realistic alignment problems. These results suggest that insertions and sequence turnover are more common than is currently thought and challenge the conventional picture of sequence evolution and mechanisms of functional and structural changes.
- Published
- 2008
46. A recurrent copy number variation of the NEB triplicate region: only revealed by the targeted nemaline myopathy CGH array
- Author
-
Carina Wallgren-Pettersson, Ari Löytynoja, K. Kiiski, J. Laitila, Vilma-Lotta Lehtokari, Liina Ahlstén, and Katarina Pelin
- Subjects
0301 basic medicine ,DNA Copy Number Variations ,Chromosome Breakpoints ,Muscle Proteins ,Myopathies, Nemaline ,Article ,03 medical and health sciences ,Nebulin ,Exon ,0302 clinical medicine ,Nemaline myopathy ,Genetics ,medicine ,Humans ,Copy-number variation ,Allele ,Genetics (clinical) ,Comparative Genomic Hybridization ,biology ,Breakpoint ,medicine.disease ,3. Good health ,030104 developmental biology ,Case-Control Studies ,biology.protein ,030217 neurology & neurosurgery ,Comparative genomic hybridization - Abstract
Recently, new large variants have been identified in the nebulin gene (NEB) causing nemaline myopathy (NM). NM constitutes a heterogeneous group of disorders among the congenital myopathies, and disease-causing variants in NEB are a main cause of the recessively inherited form of NM. NEB consists of 183 exons and it includes homologous sequences such as a 32-kb triplicate region (TRI), where eight exons are repeated three times (exons 82-89, 90-97, 98-105). In humans, the normal copy number of NEB TRI is six (three copies in each allele). Recently, we described a custom NM-CGH microarray designed to detect copy number variations (CNVs) in the known NM genes. The array has now been updated to include all the currently known 10 NM genes. The NM-CGH array is superior in detecting CNVs, especially of the NEB TRI, which is not included in the exome capture kits. To date, we have studied 266 samples from 196 NM families using the NM-CGH microarray, and identified a novel recurrent NEB TRI variation in 13% (26/196) of the families and in 10% of the controls (6/60). An analysis of the breakpoints revealed adjacent repeat elements, which are known to predispose to rearrangements such as CNVs. The control CNV samples deviate only one copy from the normal six copies, whereas the NM samples include CNVs of up to four additional copies. Based on this study, NEB seems to tolerate deviations of one TRI copy, whereas addition of two or more copies might be pathogenic.
- Published
- 2015
47. Tracking year-to-year changes in intestinal nematode communities of rufous mouse lemurs (Microcebus rufus)
- Author
-
Juha Laakkonen, Alan Medlar, Ari Löytynoja, Tuomas Aivelo, and Jukka Jernvall
- Subjects
Male ,Nematoda ,Lemur ,Intestinal parasite ,Parasitism ,medicine.disease_cause ,Feces ,Soay sheep ,biology.animal ,medicine ,Madagascar ,Parasite hosting ,Animals ,Nematode Infections ,Parasite Egg Count ,biology ,Host (biology) ,Ecology ,Community structure ,biology.organism_classification ,Infectious Diseases ,Nematode ,Animal Science and Zoology ,Parasitology ,Female ,Seasons ,Cheirogaleidae - Abstract
While it is known that intestinal parasite communities vary in their composition over time, there is a lack of studies addressing how variation in component communities (between-hosts) manifests in infracommunities (within-host) during the host lifespan. In this study, we investigate the changes in the intestinal parasite infracommunities in wild-living rufous mouse lemurs (Microcebus rufus) from Ranomafana National Park in southeastern Madagascar from 2010 to 2012. We used high-throughput barcoding of the 18S rRNA gene to interrogate parasite community structure. Our results show that in these nematode communities, there were two frequently occurring putative species and four rarer putative species. All putative species were randomly distributed over host individuals and they did not occur in clear temporal patterns. For the individuals caught in at least two different years, there was high turnover of putative species and high variation in fecal egg counts. Our study shows that while there was remarkable variation in infracommunities over time, the component community was relatively stable. Nevertheless, the patterns of prevalence varied substantially between years in each component community.
- Published
- 2015
48. Simple chained guide trees give poorer multiple sequence alignments than inferred trees in simulation and phylogenetic benchmarks
- Author
-
Nick Goldman, Christophe Dessimoz, Ge Tan, Ari Löytynoja, and Manuel Gil
- Subjects
0303 health sciences ,Sequence ,Multidisciplinary ,Phylogenetic tree ,572: Biochemie ,Inference ,Proteins ,Biology ,computer.software_genre ,Homologous Sequences ,03 medical and health sciences ,0302 clinical medicine ,Simple (abstract algebra) ,Sequence Analysis, Protein ,Data mining ,Letters ,Algorithm ,computer ,Sequence Alignment ,030217 neurology & neurosurgery ,Software ,030304 developmental biology - Abstract
Multiple sequence aligners typically work by progressively aligning the most closely related sequences or group of sequences according to guide trees. In PNAS, Boyce et al. (1) report that alignments reconstructed using simple chained trees (i.e., comb-like topologies) with random leaf assignment performed better in protein structure-based benchmarks than those reconstructed using phylogenies estimated from the data as guide trees. The authors state that this result could turn decades of research in the field on its head. In light of this statement, it is important to check immediately whether their result holds under evolutionary criteria: recovery of homologous sequence residues and inference of phylogenetic trees from the alignments (2). We have done this and the results are entirely opposed to Boyce et al.’s findings (1).
- Published
- 2015
49. Séance: reference-based phylogenetic analysis for 18S rRNA studies
- Author
-
Tuomas Aivelo, Ari Löytynoja, and Alan Medlar
- Subjects
Phylogenetic placement ,Computational biology ,Biology ,Marker gene ,18S ribosomal RNA ,18S community analysis ,Gene flow ,Workflow ,03 medical and health sciences ,0302 clinical medicine ,Phylogenetics ,RNA, Ribosomal, 18S ,Animals ,Cluster Analysis ,rRNA marker genes ,Parasites ,Longitudinal Studies ,Ecology, Evolution, Behavior and Systematics ,Phylogeny ,030304 developmental biology ,Genetics ,0303 health sciences ,Likelihood Functions ,Multiple sequence alignment ,Phylogenetic tree ,Lemur ,Phylogenetic network ,Amplicon ,Intestines ,030217 neurology & neurosurgery ,Software - Abstract
Background: Marker gene studies often use short amplicons spanning one or more hypervariable regions from an rRNA gene to interrogate the community structure of uncultured environmental samples. Target regions are chosen for their discriminatory power, but the limited phylogenetic signal of short high-throughput sequencing reads precludes accurate phylogenetic analysis. This is particularly unfortunate in the study of microscopic eukaryotes where horizontal gene flow is limited and the rRNA gene is expected to accurately reflect the species phylogeny. A promising alternative to full phylogenetic analysis is phylogenetic placement, where a reference phylogeny is inferred using the complete marker gene and iteratively extended with the short sequences from a metagenetic sample under study. Results: Based on the phylogenetic placement approach we built Séance, a community analysis pipeline focused on the analysis of 18S marker gene data. Séance combines the alignment extension and phylogenetic placement capabilities of the Pagan multiple sequence alignment program with a suite of tools to preprocess, cluster and visualise datasets composed of many samples. We showcase Séance by analysing 454 data from a longitudinal study of intestinal parasite communities in wild rufous mouse lemurs (Microcebus rufus) as well as in simulation. We demonstrate both improved OTU picking at higher levels of sequence similarity for 454 data and show the accuracy of phylogenetic placement to be comparable to maximum likelihood methods for lower numbers of taxa. Conclusions: Séance is an open source community analysis pipeline that provides reference-based phylogenetic analysis for rRNA marker gene studies. Whilst in this article we focus on studying nematodes using the 18S marker gene, the concepts are generic and reference data for alternative marker genes can be easily created. Séance can be downloaded from http://wasabiapp.org/software/seance/.
Electronic supplementary material: The online version of this article (doi:10.1186/s12862-014-0235-7) contains supplementary material, which is available to authorized users.
- Published
- 2014
50. Phylogeny-aware alignment with PRANK
- Author
-
Ari Löytynoja
- Subjects
Evolution, Molecular ,Base Sequence ,Computational Biology ,Reproducibility of Results ,Sequence Alignment ,Algorithms ,Phylogeny ,Software - Abstract
Evolutionary analyses require sequence alignments that correctly represent evolutionary homology. Evolutionary and structural homology are not the same and sequence alignments generated with methods designed for structural matching can be seriously misleading in comparative and phylogenetic analyses. The phylogeny-aware alignment algorithm implemented in the program PRANK has been shown to produce good alignments for evolutionary inferences. Unlike other alignment programs, PRANK makes use of phylogenetic information to distinguish alignment gaps caused by insertions or deletions and, thereafter, handles the two types of events differently. As a by-product of the correct handling of insertions and deletions, PRANK can provide the inferred ancestral sequences as a part of the output and mark the alignment gaps differently depending on their origin in insertion or deletion events. As the algorithm infers the evolutionary history of the sequences, PRANK can be sensitive to errors in the guide phylogeny and violations of the underlying assumptions about the origin and patterns of gaps. These issues are discussed in detail and practical advice for the use of PRANK in evolutionary analysis is provided. The PRANK software and other methods discussed here can be found on the program home page at http://code.google.com/p/prank-msa/.
- Published
- 2013