21 results on '"Weilguny, Lukas"'
Search Results
2. phastSim: Efficient simulation of sequence evolution for pandemic-scale datasets
- Author
-
De Maio, Nicola, Boulton, William, Weilguny, Lukas, Walker, Conor R, Turakhia, Yatish, Corbett-Detig, Russell, and Goldman, Nick
- Subjects
Biological Sciences ,Bioinformatics and Computational Biology ,Evolutionary Biology ,Genetics ,Information and Computing Sciences ,Applied Computing ,Networking and Information Technology R&D (NITRD) ,Human Genome ,1.4 Methodologies and measurements ,Underpinning research ,Generic health relevance ,Algorithms ,COVID-19 ,Computer Simulation ,Evolution ,Molecular ,Humans ,Pandemics ,Phylogeny ,SARS-CoV-2 ,Software ,Mathematical Sciences ,Bioinformatics
- Abstract
Sequence simulators are fundamental tools in bioinformatics, as they allow us to test data processing and inference tools, and are an essential component of some inference methods. The ongoing surge in available sequence data is however testing the limits of our bioinformatics software. One example is the large number of SARS-CoV-2 genomes available, which are beyond the processing power of many methods, and simulating such large datasets is also proving difficult. Here, we present a new algorithm and software for efficiently simulating sequence evolution along extremely large trees (e.g. > 100,000 tips) when the branches of the tree are short, as is typical in genomic epidemiology. Our algorithm is based on the Gillespie approach, and it implements an efficient multi-layered search tree structure that provides high computational efficiency by taking advantage of the fact that only a small proportion of the genome is likely to mutate at each branch of the considered phylogeny. Our open source software allows easy integration with other Python packages as well as a variety of evolutionary models, including indel models and new hypermutability models that we developed to more realistically represent SARS-CoV-2 genome evolution.
- Published
- 2022
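Note on the abstract above: the Gillespie approach it mentions can be illustrated with a minimal sketch. This is not phastSim's code or API. The function name `gillespie_branch`, the Jukes-Cantor-style model (every site leaves its current state at rate 1), and the uniform choice of site are all assumptions made for illustration, and the multi-layered search tree described in the abstract is not reproduced here.

```python
import random

NUCLEOTIDES = "ACGT"


def gillespie_branch(seq, branch_length, rng):
    """Evolve `seq` (a list of nucleotides) in place along one branch.

    Hypothetical helper, not part of phastSim. Assumes every site
    leaves its current state at rate 1, so the total rate is len(seq).
    Returns the substitutions as (position, old, new) tuples.
    """
    events = []
    if not seq:
        return events
    total_rate = float(len(seq))
    # Waiting time to the first event is exponential in the total rate.
    t = rng.expovariate(total_rate)
    while t < branch_length:
        # All sites share one rate here, so the site is chosen uniformly.
        pos = rng.randrange(len(seq))
        old = seq[pos]
        new = rng.choice([n for n in NUCLEOTIDES if n != old])
        seq[pos] = new
        events.append((pos, old, new))
        t += rng.expovariate(total_rate)
    return events


if __name__ == "__main__":
    genome = list("ACGT" * 250)
    changes = gillespie_branch(genome, 0.001, random.Random(42))
    print(len(changes), "substitutions on a branch of length 0.001")
```

The work done per branch scales with the number of substitutions drawn rather than with genome length, which is the property the abstract exploits for short branches.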
3. Dynamic, adaptive sampling during nanopore sequencing using Bayesian experimental design
- Author
-
Weilguny, Lukas, De Maio, Nicola, Munro, Rory, Manser, Charlotte, Birney, Ewan, Loose, Matthew, and Goldman, Nick
- Published
- 2023
4. Stability of SARS-CoV-2 phylogenies.
- Author
-
Turakhia, Yatish, De Maio, Nicola, Thornlow, Bryan, Gozashti, Landen, Lanfear, Robert, Walker, Conor R, Hinrichs, Angie S, Fernandes, Jason D, Borges, Rui, Slodkowicz, Greg, Weilguny, Lukas, Haussler, David, Goldman, Nick, and Corbett-Detig, Russell
- Subjects
Genetics ,Developmental Biology
- Abstract
The SARS-CoV-2 pandemic has led to unprecedented, nearly real-time genetic tracing due to the rapid community sequencing response. Researchers immediately leveraged these data to infer the evolutionary relationships among viral samples and to study key biological questions, including whether host viral genome editing and recombination are features of SARS-CoV-2 evolution. This global sequencing effort is inherently decentralized and must rely on data collected by many labs using a wide variety of molecular and bioinformatic techniques. There is thus a strong possibility that systematic errors associated with lab- or protocol-specific practices affect some sequences in the repositories. We find that some recurrent mutations in reported SARS-CoV-2 genome sequences have been observed predominantly or exclusively by single labs, co-localize with commonly used primer binding sites and are more likely to affect the protein-coding sequences than other similarly recurrent mutations. We show that their inclusion can affect phylogenetic inference on scales relevant to local lineage tracing, and make it appear as though there has been an excess of recurrent mutation or recombination among viral lineages. We suggest how samples can be screened and problematic variants removed, and we plan to regularly inform the scientific community with our updated results as more SARS-CoV-2 genome sequences are shared (https://virological.org/t/issues-with-sars-cov-2-sequencing-data/473 and https://virological.org/t/masking-strategies-for-sars-cov-2-alignments/480). We also develop tools for comparing and visualizing differences among very large phylogenies and we show that consistent clade- and tree-based comparisons can be made between phylogenies produced by different groups. These will facilitate evolutionary inferences and comparisons among phylogenies produced for a wide array of purposes. 
Building on the SARS-CoV-2 Genome Browser at UCSC, we present a toolkit to compare, analyze and combine SARS-CoV-2 phylogenies, find and remove potential sequencing errors and establish a widely shared, stable clade structure for a more accurate scientific inference and discourse.
- Published
- 2020
5. High-speed volumetric imaging of neuronal activity in freely moving rodents
- Author
-
Skocek, Oliver, Nöbauer, Tobias, Weilguny, Lukas, Martínez Traub, Francisca, Xia, Chuying Naomi, Molodtsov, Maxim I, Grama, Abhinav, Yamagata, Masahito, Aharoni, Daniel, Cox, David D, Golshani, Peyman, and Vaziri, Alipasha
- Subjects
Neurosciences ,Neurological ,Animals ,Hippocampus ,Intravital Microscopy ,Mice ,Miniaturization ,Neurons ,Optical Imaging ,Biological Sciences ,Technology ,Medical and Health Sciences ,Developmental Biology
- Abstract
Thus far, optical recording of neuronal activity in freely behaving animals has been limited to a thin axial range. We present a head-mounted miniaturized light-field microscope (MiniLFM) capable of capturing neuronal network activity within a volume of 700 × 600 × 360 µm³ at 16 Hz in the hippocampus of freely moving mice. We demonstrate that neurons separated by as little as ~15 µm and at depths up to 360 µm can be discriminated.
- Published
- 2018
6. Soil carbon in the world's tidal marshes
- Author
-
Maxwell, Tania L, Spalding, Mark D, Friess, Daniel A, Murray, Nicholas J, Rogers, Kerrylee, Rovai, Andre S, Smart, Lindsey S, Weilguny, Lukas, Adame, Maria Fernanda, Adams, Janine B, Copertino, Margareth S, Cott, Grace M, Duarte de Paula Costa, Micheli, Holmquist, James R, Ladd, Cai J T, Lovelock, Catherine, Ludwig, Marvin, Moritsch, Monica M, Navarro, Alejandro, Raw, Jacqueline L, Ruiz-Fernandez, Ana-Carolina, Serrano, Oscar, Smeaton, Craig, Van de Broek, Marijn, Windham-Myers, Lisamarie, Landis, Emily, and Worthington, Thomas A
- Published
- 2024
7. Video rate volumetric Ca2+ imaging across cortex using seeded iterative demixing (SID) microscopy
- Author
-
Nöbauer, Tobias, Skocek, Oliver, Pernía-Andrade, Alejandro J, Weilguny, Lukas, Traub, Francisca Martínez, Molodtsov, Maxim I, and Vaziri, Alipasha
- Published
- 2017
8. Reconstructing the Invasion Route of the P-Element in Drosophila melanogaster Using Extant Population Samples
- Author
-
Weilguny, Lukas, Vlachos, Christos, Selvaraju, Divya, and Kofler, Robert
- Subjects
AcademicSubjects/SCI01140 ,Gene Flow ,Models, Genetic ,AcademicSubjects/SCI01130 ,population genetics ,bioinformatics ,Drosophila melanogaster ,DNA Transposable Elements ,P-element ,Animals ,Drosophila ,transposable elements ,Research Article ,Sequence Deletion
- Abstract
The P-element, one of the best understood eukaryotic transposable elements, spread in natural Drosophila melanogaster populations in the last century. It invaded American populations first and later spread to the Old World. Inferring this invasion route was made possible by a unique resource available in D. melanogaster: Many strains sampled from different locations over the course of the last century. Here, we test the hypothesis that the invasion route of the P-element may be reconstructed from extant population samples using internal deletions (IDs) as markers. These IDs arise at a high rate when DNA transposons, such as the P-element, are active. We suggest that inferring invasion routes is possible as: 1) the fraction of IDs increases in successively invaded populations, which also explains the striking differences in the ID content between American and European populations, and 2) successively invaded populations end up with similar sets of IDs. This approach allowed us to reconstruct the invasion route of the P-element with reasonable accuracy. Our approach also sheds light on the unknown timing of the invasion in African populations: We suggest that African populations were invaded after American but before European populations. Simulations of TE invasions in spatially distributed populations confirm that IDs may allow us to infer invasion routes. Our approach might be applicable to other DNA transposons in different host species.
- Published
- 2020
9. Author Correction: High-speed volumetric imaging of neuronal activity in freely moving rodents
- Author
-
Skocek, Oliver, Nöbauer, Tobias, Weilguny, Lukas, Traub, Francisca Martínez, Xia, Chuying Naomi, Molodtsov, Maxim I., Grama, Abhinav, Yamagata, Masahito, Aharoni, Daniel, Cox, David D., Golshani, Peyman, and Vaziri, Alipasha
- Published
- 2018
10. phastSim: efficient simulation of sequence evolution for pandemic-scale datasets
- Author
-
De Maio, Nicola, Boulton, William, Weilguny, Lukas, Walker, Conor R., Turakhia, Yatish, Corbett-Detig, Russell, and Goldman, Nick
- Published
- 2021
11. Sea anemone genomes reveal ancestral metazoan chromosomal macrosynteny
- Author
-
Zimmermann, Bob, Robb, Sofia M.C., Genikhovich, Grigory, Fropf, Whitney J., Weilguny, Lukas, He, Shuonan, Chen, Shiyuan, Lovegrove-Walsh, Jessica, Hill, Eric M., Chen, Cheng-Yi, Ragkousi, Katerina, Praher, Daniela, Fredman, David, Moran, Yehu, Gibson, Matthew C., and Technau, Ulrich
- Published
- 2020
12. Reconstructing the Invasion Route of the P-Element in Drosophila melanogaster Using Extant Population Samples
- Author
-
Weilguny, Lukas, Vlachos, Christos, Selvaraju, Divya, and Kofler, Robert
- Published
- 2020
13. Dynamic, adaptive sampling during nanopore sequencing using Bayesian experimental design
- Author
-
Weilguny, Lukas, De Maio, Nicola, Munro, Rory, Manser, Charlotte, Birney, Ewan, Loose, Matt, and Goldman, Nick
- Published
- 2020
14. Development of a novel tool to uncover mobile genetic element diversity and trace the invasion of DNA transposons
- Author
-
Weilguny, Lukas
- Abstract
Transposable elements (TEs) are selfish DNA sequences that multiply within host genomes. They are present in most species investigated so far at varying degrees of abundance and sequence diversity. The TE composition may not only vary between but also within species and could have important biological implications. Variation in prevalence among populations may for example indicate a recent TE invasion, whereas sequence variation could indicate the presence of hyperactive or inactive forms. Gaining unbiased estimates of TE composition is thus vital for understanding the evolutionary dynamics of transposons. To this end we developed DeviaTE, a tool to analyze and visualize TE abundance using Illumina or Sanger reads. Our program only requires sequencing reads and consensus sequences of TEs. Thus, it works in an assembly-free manner, increasing its applicability to non-model organisms for which a high-quality assembly is not available yet. It generates a table and a visual representation of TE composition and provides unbiased estimates of TE abundance. 
Using published data we demonstrate that DeviaTE can be used to study TE composition within samples, identify clinal variation in TEs or compare TE diversity among species. We also present careful validation with simulated data. Moreover, we describe a model of DNA transposon invasions and an approach to reconstruct the history of such invasions using our novel tool. We propose that an invasion leaves unique fingerprints within populations, which consist of non-autonomous, internally deleted variants of TEs. Using these TE remnants, we show that the sequence of the P-element invasion in North American and European Drosophila melanogaster populations can be retraced. In particular, we find that patterns of internally deleted variants recover the geographic distribution of sampled populations. Additionally, we identify potential origins and routes of the invasion on both continents. With the development of DeviaTE we hope to catalyze future progress in our understanding of TE invasion dynamics and other diverse phenomena, in which TEs play a central role.
- Published
- 2019
15. Reconstructing the invasion route of DNA transposons using extant population samples
- Author
-
Weilguny, Lukas, Vlachos, Christos, Selvaraju, Divya, and Kofler, Robert
- Published
- 2019
16. DeviaTE: Assembly‐free analysis and visualization of mobile genetic element composition
- Author
-
Weilguny, Lukas and Kofler, Robert
- Published
- 2019
17. Video rate volumetric Ca2+ imaging across cortical layers using Seeded Iterative Demixing (SID) microscopy
- Author
-
Nöbauer, Tobias, Skocek, Oliver, Pernía-Andrade, Alejandro J., Weilguny, Lukas, Traub, Francisca Martinez, Molodtsov, Maxim I., and Vaziri, Alipasha
- Published
- 2017
18. Video rate volumetric Ca2+ imaging across cortex using seeded iterative demixing (SID) microscopy
- Author
-
Nöbauer, Tobias, Skocek, Oliver, Pernía-Andrade, Alejandro J, Weilguny, Lukas, Traub, Francisca Martínez, Molodtsov, Maxim I, and Vaziri, Alipasha
- Abstract
The seeded iterative demixing strategy, when used in combination with light-field microscopy, enables calcium imaging at single-neuron resolution in the mouse brain at high volumetric imaging rates and depths of up to 380 μm.
- Published
- 2017
19. phastSim: Efficient simulation of sequence evolution for pandemic-scale datasets
- Author
-
De Maio, Nicola [0000-0002-1776-8564], Boulton, William [0000-0002-8258-4673], Weilguny, Lukas [0000-0001-6459-0431], Walker, Conor R [0000-0001-5617-5086], Turakhia, Yatish, Corbett-Detig, Russell [0000-0001-6535-2478], Goldman, Nick [0000-0001-8486-2211], Apollo - University of Cambridge Repository, and Mustonen, Ville
- Subjects
FOS: Computer and information sciences ,Computer science ,Inference ,Engineering and technology ,computer.software_genre ,Mathematical Sciences ,Software ,Phylogeny ,computer.programming_language ,Sequence ,Genome ,Computer and information sciences ,Ecology ,Biological Sciences ,1.4 Methodologies and measurements ,Tree (data structure) ,Networking and Information Technology R&D ,Computational Theory and Mathematics ,Modeling and Simulation ,Data mining ,Simulation ,Algorithms ,Research Article ,Genome evolution ,Evolution ,Bioinformatics ,Article ,Evolution, Molecular ,Cellular and Molecular Neuroscience ,Underpinning research ,Information and Computing Sciences ,Genetics ,Humans ,Computer Simulation ,Selection ,Molecular Biology ,Pandemics ,Ecology, Evolution, Behavior and Systematics ,Medicine and health sciences ,Biology and life sciences ,business.industry ,SARS-CoV-2 ,Human Genome ,Molecular ,COVID-19 ,Python (programming language) ,FOS: Engineering and technology ,Search tree ,Research and analysis methods ,Mutation ,Generic health relevance ,business ,computer ,Test data
- Abstract
Funder: European Molecular Biology Laboratory; funder-id: http://dx.doi.org/10.13039/100013060. Funder: Schmidt Futures Foundation.
Sequence simulators are fundamental tools in bioinformatics, as they allow us to test data processing and inference tools, and are an essential component of some inference methods. The ongoing surge in available sequence data is however testing the limits of our bioinformatics software. One example is the large number of SARS-CoV-2 genomes available, which are beyond the processing power of many methods, and simulating such large datasets is also proving difficult. Here, we present a new algorithm and software for efficiently simulating sequence evolution along extremely large trees (e.g. > 100,000 tips) when the branches of the tree are short, as is typical in genomic epidemiology. Our algorithm is based on the Gillespie approach, and it implements an efficient multi-layered search tree structure that provides high computational efficiency by taking advantage of the fact that only a small proportion of the genome is likely to mutate at each branch of the considered phylogeny. Our open source software allows easy integration with other Python packages as well as a variety of evolutionary models, including indel models and new hypermutability models that we developed to more realistically represent SARS-CoV-2 genome evolution.
- Published
- 2022
20. Stability of SARS-CoV-2 phylogenies
- Author
-
Turakhia, Yatish [0000-0001-5600-2900], De Maio, Nicola [0000-0002-1776-8564], Thornlow, Bryan [0000-0001-6334-5186], Gozashti, Landen, Lanfear, Robert, Walker, Conor R [0000-0001-5617-5086], Hinrichs, Angie S [0000-0002-1697-1130], Fernandes, Jason D [0000-0002-8625-1796], Borges, Rui [0000-0002-5905-3778], Slodkowicz, Greg [0000-0001-6918-0386], Weilguny, Lukas [0000-0001-6459-0431], Haussler, David [0000-0003-1533-4575], Goldman, Nick, Corbett-Detig, Russell [0000-0001-6535-2478], Barsh, Gregory S, and Apollo - University of Cambridge Repository
- Subjects
RNA viruses ,Cancer Research ,Coronaviruses ,Genome browser ,QH426-470 ,Genome ,Trees ,0302 clinical medicine ,Genome editing ,Viral ,Clade ,Pathology and laboratory medicine ,Phylogeny ,Genetics (clinical) ,Data Management ,0303 health sciences ,Microbial Mutation ,Eukaryota ,Phylogenetic Analysis ,Genomics ,Medical microbiology ,Plants ,Phylogenetics ,Infectious Diseases ,Viral evolution ,Viruses ,RNA, Viral ,SARS CoV 2 ,Pathogens ,Infection ,Algorithms ,Research Article ,Computer and Information Sciences ,SARS coronavirus ,Evolution ,Genome, Viral ,Computational biology ,Biology ,Microbiology ,Viral Evolution ,Evolution, Molecular ,Vaccine Related ,03 medical and health sciences ,Virology ,Biodefense ,Genetics ,Humans ,Evolutionary Systematics ,Molecular Biology ,Alleles ,Ecology, Evolution, Behavior and Systematics ,Taxonomy ,030304 developmental biology ,Medicine and health sciences ,Whole genome sequencing ,Evolutionary Biology ,Whole Genome Sequencing ,SARS-CoV-2 ,Prevention ,Human Genome ,Organisms ,Viral pathogens ,Biology and Life Sciences ,Computational Biology ,Molecular ,COVID-19 ,Organismal Evolution ,Microbial pathogens ,Emerging Infectious Diseases ,Genetic Loci ,Microbial Evolution ,RNA ,Sequence Alignment ,030217 neurology & neurosurgery ,Developmental Biology
- Abstract
Funder: Alfred P. Sloan Foundation; funder-id: http://dx.doi.org/10.13039/100000879. Funder: European Molecular Biology Laboratory (EMBL).
The SARS-CoV-2 pandemic has led to unprecedented, nearly real-time genetic tracing due to the rapid community sequencing response. Researchers immediately leveraged these data to infer the evolutionary relationships among viral samples and to study key biological questions, including whether host viral genome editing and recombination are features of SARS-CoV-2 evolution. This global sequencing effort is inherently decentralized and must rely on data collected by many labs using a wide variety of molecular and bioinformatic techniques. There is thus a strong possibility that systematic errors associated with lab- or protocol-specific practices affect some sequences in the repositories. We find that some recurrent mutations in reported SARS-CoV-2 genome sequences have been observed predominantly or exclusively by single labs, co-localize with commonly used primer binding sites and are more likely to affect the protein-coding sequences than other similarly recurrent mutations. We show that their inclusion can affect phylogenetic inference on scales relevant to local lineage tracing, and make it appear as though there has been an excess of recurrent mutation or recombination among viral lineages. We suggest how samples can be screened and problematic variants removed, and we plan to regularly inform the scientific community with our updated results as more SARS-CoV-2 genome sequences are shared (https://virological.org/t/issues-with-sars-cov-2-sequencing-data/473 and https://virological.org/t/masking-strategies-for-sars-cov-2-alignments/480). We also develop tools for comparing and visualizing differences among very large phylogenies and we show that consistent clade- and tree-based comparisons can be made between phylogenies produced by different groups. 
These will facilitate evolutionary inferences and comparisons among phylogenies produced for a wide array of purposes. Building on the SARS-CoV-2 Genome Browser at UCSC, we present a toolkit to compare, analyze and combine SARS-CoV-2 phylogenies, find and remove potential sequencing errors and establish a widely shared, stable clade structure for a more accurate scientific inference and discourse.
- Published
- 2020
21. phastSim: efficient simulation of sequence evolution for pandemic-scale datasets.
- Author
-
De Maio N, Boulton W, Weilguny L, Walker CR, Turakhia Y, Corbett-Detig R, and Goldman N
- Abstract
Sequence simulators are fundamental tools in bioinformatics, as they allow us to test data processing and inference tools, as well as being part of some inference methods. The ongoing surge in available sequence data is however testing the limits of our bioinformatics software. One example is the large number of SARS-CoV-2 genomes available, which are beyond the processing power of many methods, and simulating such large datasets is also proving difficult. Here we present a new algorithm and software for efficiently simulating sequence evolution along extremely large trees (e.g. > 100,000 tips) when the branches of the tree are short, as is typical in genomic epidemiology. Our algorithm is based on the Gillespie approach, and implements an efficient multi-layered search tree structure that provides high computational efficiency by taking advantage of the fact that only a small proportion of the genome is likely to mutate at each branch of the considered phylogeny. Our open source software is available from https://github.com/NicolaDM/phastSim and allows easy integration with other Python packages as well as a variety of evolutionary models, including indel models and new hypermutability models that we developed to more realistically represent SARS-CoV-2 genome evolution.
- Published
- 2021