231 results on '"Koonin, Eugene V."'
Search Results
2. Discovery of Diverse CRISPR-Cas Systems and Expansion of the Genome Engineering Toolbox
- Author
-
Koonin, Eugene V., Gootenberg, Jonathan S., and Abudayyeh, Omar O.
- Abstract
CRISPR systems mediate adaptive immunity in bacteria and archaea through diverse effector mechanisms and have been repurposed for versatile applications in therapeutics and diagnostics thanks to their facile reprogramming with RNA guides. RNA-guided CRISPR-Cas targeting and interference are mediated by effectors that are either components of multisubunit complexes in class 1 systems or multidomain single-effector proteins in class 2. The compact class 2 CRISPR systems have been broadly adopted for multiple applications, especially genome editing, leading to a transformation of the molecular biology and biotechnology toolkit. The diversity of class 2 effector enzymes, initially limited to the Cas9 nuclease, was substantially expanded via computational genome and metagenome mining to include numerous variants of Cas12 and Cas13, providing substrates for the development of versatile, orthogonal molecular tools. Characterization of these diverse CRISPR effectors uncovered many new features, including distinct protospacer adjacent motifs (PAMs) that expand the targeting space, improved editing specificity, RNA rather than DNA targeting, smaller crRNAs, staggered and blunt end cuts, miniature enzymes, promiscuous RNA and DNA cleavage, etc. These unique properties enabled multiple applications, such as harnessing the promiscuous RNase activity of the type VI effector, Cas13, for supersensitive nucleic acid detection. Class 1 CRISPR systems have been adopted for genome editing, as well, despite the challenge of expressing and delivering the multiprotein class 1 effectors. The rich diversity of CRISPR enzymes led to rapid maturation of the genome editing toolbox, with capabilities such as gene knockout, base editing, prime editing, gene insertion, DNA imaging, epigenetic modulation, transcriptional modulation, and RNA editing. Combined with rational design and engineering of the effector proteins and associated RNAs, the natural diversity of CRISPR and related bacterial RNA-guided systems provides a vast resource for expanding the repertoire of tools for molecular biology and biotechnology.
- Published
- 2023
3. Structural atlas of a human gut crassvirus
- Author
-
Bayfield, Oliver W., Shkoporov, Andrey N., Yutin, Natalya, Khokhlova, Ekaterina V., Smith, Jake L. R., Hawkins, Dorothy E. D. P., Koonin, Eugene V., Hill, Colin, and Antson, Alfred A.
- Abstract
CrAssphage and related viruses of the order Crassvirales (hereafter referred to as crassviruses) were originally discovered by cross-assembly of metagenomic sequences. They are the most abundant viruses in the human gut, are found in the majority of individual gut viromes, and account for up to 95% of the viral sequences in some individuals [1–4]. Crassviruses are likely to have major roles in shaping the composition and functionality of the human microbiome, but the structures and roles of most of the virally encoded proteins are unknown, with only generic predictions resulting from bioinformatic analyses [4,5]. Here we present a cryo-electron microscopy reconstruction of Bacteroides intestinalis virus ΦcrAss001 [6], providing the structural basis for the functional assignment of most of its virion proteins. The muzzle protein forms an assembly about 1 MDa in size at the end of the tail and exhibits a previously unknown fold that we designate the ‘crass fold’, that is likely to serve as a gatekeeper that controls the ejection of cargos. In addition to packing the approximately 103 kb of virus DNA, the ΦcrAss001 virion has extensive storage space for virally encoded cargo proteins in the capsid and, unusually, within the tail. One of the cargo proteins is present in both the capsid and the tail, suggesting a general mechanism for protein ejection, which involves partial unfolding of proteins during their extrusion through the tail. These findings provide a structural basis for understanding the mechanisms of assembly and infection of these highly abundant crassviruses.
- Published
- 2023
4. Learn from the past to predict viral pandemics
- Author
-
Rochman, Nash D. and Koonin, Eugene V.
- Abstract
The COVID-19 pandemic highlighted the need to understand the emergence of viral variants, given that these can have implications for vaccination success. A bioinformatics tool offers a way to predict viral evolution.
- Published
- 2023
5. The logic of virus evolution.
- Author
-
Koonin, Eugene V., Dolja, Valerian V., and Krupovic, Mart
- Abstract
Viruses are obligate intracellular parasites. Despite their dependence on host cells, viruses are evolutionarily autonomous, with their own genomes and evolutionary trajectories locked in arms races with the hosts. Here, we discuss a simple functional logic to explain virus macroevolution that appears to define the course of virus evolution. A small core of virus hallmark genes that are responsible for genome replication apparently descended from primordial replicators, whereas most virus genes, starting with those encoding capsid proteins, were subsequently acquired from hosts. The oldest of these acquisitions antedate the last universal cellular ancestor (LUCA). Host gene capture followed two major routes: convergent recruitment of genes with functions that directly benefit virus reproduction and exaptation when host proteins are repurposed for unique virus functions. These forms of host protein recruitment by viruses result in different levels of similarity between virus and host homologs, with the exapted ones often changing beyond easy recognition. In this perspective, Koonin, Dolja, and Krupovic examine the capture of host genes by viruses and how this process shaped virus genomes through the ∼4 billion years of virus-host coevolution. The logic of repurposing cellular proteins determines the trajectories of virus macroevolution including the emergence of new groups of viruses. [ABSTRACT FROM AUTHOR]
- Published
- 2022
6. Structure of the OMEGA nickase IsrB in complex with ωRNA and target DNA
- Author
-
Hirano, Seiichi, Kappel, Kalli, Altae-Tran, Han, Faure, Guilhem, Wilkinson, Max E., Kannan, Soumya, Demircioglu, F. Esra, Yan, Rui, Shiozaki, Momoko, Yu, Zhiheng, Makarova, Kira S., Koonin, Eugene V., Macrae, Rhiannon K., and Zhang, Feng
- Abstract
RNA-guided systems, such as CRISPR–Cas, combine programmable substrate recognition with enzymatic function, a combination that has been used advantageously to develop powerful molecular technologies [1,2]. Structural studies of these systems have illuminated how the RNA and protein jointly recognize and cleave their substrates, guiding rational engineering for further technology development [3]. Recent work identified a new class of RNA-guided systems, termed OMEGA, which include IscB, the likely ancestor of Cas9, and the nickase IsrB, a homologue of IscB lacking the HNH nuclease domain [4]. IsrB consists of only around 350 amino acids, but its small size is counterbalanced by a relatively large RNA guide (roughly 300-nt ωRNA). Here, we report the cryogenic-electron microscopy structure of Desulfovirgula thermocuniculi IsrB (DtIsrB) in complex with its cognate ωRNA and a target DNA. We find the overall structure of the IsrB protein shares a common scaffold with Cas9. In contrast to Cas9, however, which uses a recognition (REC) lobe to facilitate target selection, IsrB relies on its ωRNA, part of which forms an intricate ternary structure positioned analogously to REC. Structural analyses of IsrB and its ωRNA as well as comparisons to other RNA-guided systems highlight the functional interplay between protein and RNA, advancing our understanding of the biology and evolution of these diverse systems.
- Published
- 2022
7. Fishing for phages in metagenomes: what do we catch, what do we miss?
- Author
-
Benler, Sean and Koonin, Eugene V
- Abstract
Metagenomics and metatranscriptomics have become the principal approaches for discovery of novel bacteriophages and preliminary characterization of their ecology and biology. Metagenomic sequencing dramatically expanded the known diversity of tailed and non-tailed phages with double-stranded DNA genomes and those with single-stranded DNA genomes, whereas metatranscriptomics led to the discovery of thousands of new single-stranded RNA phages. Apart from expanding phage diversity, metagenomics studies discover major novel groups of phages with unique features of genome organization, expression strategy and virus–host interaction, such as the putative order 'crAssvirales', which includes the most abundant human-associated viruses. The continued success of metagenomics hinges on the combination of the most powerful computational methods for phage genome assembly and analysis including harnessing CRISPR spacers for the discovery of novel phages and host assignment. Together, these approaches could make a comprehensive characterization of the earth phageome a realistic goal. [ABSTRACT FROM AUTHOR]
- Published
- 2021
8. Structural Basis for a Dual Function ATP Grasp Ligase That Installs Single and Bicyclic ω‑Ester Macrocycles in a New Multicore RiPP Natural Product.
- Author
-
Zhao, Gengxiang, Kosek, Dalibor, Liu, Hong-Bing, Ohlemacher, Shannon I., Blackburne, Brittney, Nikolskaya, Anastasia, Makarova, Kira S., Sun, Jiadong, Barry III, Clifton E., Koonin, Eugene V., Dyda, Fred, and Bewley, Carole A.
- Published
- 2021
9. Bacterial defense systems exhibit synergistic anti-phage activity.
- Author
-
Wu, Yi, Garushyants, Sofya K., van den Hurk, Anne, Aparicio-Maldonado, Cristian, Kushwaha, Simran Krishnakant, King, Claire M., Ou, Yaqing, Todeschini, Thomas C., Clokie, Martha R.J., Millard, Andrew D., Gençay, Yilmaz Emre, Koonin, Eugene V., and Nobrega, Franklin L.
- Abstract
Bacterial defense against phage predation involves diverse defense systems acting individually and concurrently, yet their interactions remain poorly understood. We investigated >100 defense systems in 42,925 bacterial genomes and identified numerous instances of their non-random co-occurrence and negative association. For several pairs of defense systems significantly co-occurring in Escherichia coli strains, we demonstrate synergistic anti-phage activity. Notably, Zorya II synergizes with Druantia III and ietAS defense systems, while tmn exhibits synergy with co-occurring systems Gabija, Septu I, and PrrC. For Gabija, tmn co-opts the sensory switch ATPase domain, enhancing anti-phage activity. Some defense system pairs that are negatively associated in E. coli show synergy and significantly co-occur in other taxa, demonstrating that bacterial immune repertoires are largely shaped by selection for resistance against host-specific phages rather than negative epistasis. Collectively, these findings demonstrate compatibility and synergy between defense systems, allowing bacteria to adopt flexible strategies for phage defense.
• Co-occurring bacterial defense systems display synergistic anti-phage activity
• Zorya II synergizes with Druantia III and ietAS, and tmn synergizes with Gabija and Septu I
• Tmn synergizes with defense systems containing sensory switch ATPase domains
• Active systems recruit functional domains of inactive systems for enhanced efficacy
Bacteria defend against phages using a variety of defense systems, yet their interactions are poorly understood. Wu and Garushyants et al. reveal that these defense systems are generally compatible and, in some instances, interact resulting in synergistic anti-phage effects, conferring an evolutionary advantage on bacteria under specific environmental conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. The healthy human virome: from virus–host symbiosis to disease.
- Author
-
Koonin, Eugene V, Dolja, Valerian V, and Krupovic, Mart
- Abstract
Viruses are ubiquitous, essential components of any ecosystem, and of multicellular organism holobionts. Numerous viruses cause acute infection, killing the host or being cleared by immune system. In many other cases, viruses coexist with the host as symbionts, either temporarily or for the duration of the host's life. Apparently, virus–host relationships span the entire range from aggressive parasitism to mutualism. Here we attempt to delineate the healthy human virome, that is, the entirety of viruses that are present in a healthy human body. The bulk of the healthy virome consists of bacteriophages infecting bacteria in the intestine and other locations. However, a variety of viruses, such as anelloviruses and herpesviruses, and the numerous endogenous retroviruses, persist by replicating in human cells, and these are our primary focus. Crucially, the boundary between symbiotic and pathogenic viruses is fluid such that members of the healthy virome can become pathogens under changing conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2021
11. Programmable RNA targeting with the single-protein CRISPR effector Cas7-11
- Author
-
Özcan, Ahsen, Krajeski, Rohan, Ioannidi, Eleonora, Lee, Brennan, Gardner, Apolonia, Makarova, Kira S., Koonin, Eugene V., Abudayyeh, Omar O., and Gootenberg, Jonathan S.
- Abstract
CRISPR–Cas interference is mediated by Cas effector nucleases that are either components of multisubunit complexes—in class 1 CRISPR–Cas systems—or domains of a single protein—in class 2 systems [1–3]. Here we show that the subtype III-E effector Cas7-11 is a single-protein effector in the class 1 CRISPR–Cas systems originating from the fusion of a putative Cas11 domain and multiple Cas7 subunits that are derived from subtype III-D. Cas7-11 from Desulfonema ishimotonii (DiCas7-11), when expressed in Escherichia coli, has substantial RNA interference effectivity against mRNAs and bacteriophages. Similar to many class 2 effectors—and unique among class 1 systems—DiCas7-11 processes pre-CRISPR RNA into mature CRISPR RNA (crRNA) and cleaves RNA at positions defined by the target:spacer duplex, without detectable non-specific activity. We engineered Cas7-11 for RNA knockdown and editing in mammalian cells. We show that Cas7-11 has no effects on cell viability, whereas other RNA-targeting tools (such as short hairpin RNAs and Cas13) show substantial cell toxicity [4,5]. This study illustrates the evolution of a single-protein effector from multisubunit class 1 effector complexes, expanding our understanding of the diversity of CRISPR systems. Cas7-11 provides the basis for new programmable RNA-targeting tools that are free of collateral activity and cell toxicity.
- Published
- 2021
12. Mutation–selection balance and compensatory mechanisms in tumour evolution
- Author
-
Persi, Erez, Wolf, Yuri I., Horn, David, Ruppin, Eytan, Demichelis, Francesca, Gatenby, Robert A., Gillies, Robert J., and Koonin, Eugene V.
- Abstract
Intratumour heterogeneity and phenotypic plasticity, sustained by a range of somatic aberrations, as well as epigenetic and metabolic adaptations, are the principal mechanisms that enable cancers to resist treatment and survive under environmental stress. A comprehensive picture of the interplay between different somatic aberrations, from point mutations to whole-genome duplications, in tumour initiation and progression is lacking. We posit that different genomic aberrations generally exhibit a temporal order, shaped by a balance between the levels of mutations and selective pressures. Repeat instability emerges first, followed by larger aberrations, with compensatory effects leading to robust tumour fitness maintained throughout the tumour progression. A better understanding of the interplay between genetic aberrations, the microenvironment, and epigenetic and metabolic cellular states is essential for early detection and prevention of cancer as well as development of efficient therapeutic strategies.
- Published
- 2021
13. Structure and function of virion RNA polymerase of a crAss-like phage
- Author
-
Drobysheva, Arina V., Panafidina, Sofia A., Kolesnik, Matvei V., Klimuk, Evgeny I., Minakhin, Leonid, Yakunina, Maria V., Borukhov, Sergei, Nilsson, Emelie, Holmfeldt, Karin, Yutin, Natalya, Makarova, Kira S., Koonin, Eugene V., Severinov, Konstantin V., Leiman, Petr G., and Sokolova, Maria L.
- Abstract
CrAss-like phages are a recently described expansive group of viruses that includes the most abundant virus in the human gut [1–3]. The genomes of all crAss-like phages encode a large virion-packaged protein [2,4] that contains a DFDxD sequence motif, which forms the catalytic site in cellular multisubunit RNA polymerases (RNAPs) [5]. Here, using Cellulophaga baltica crAss-like phage phi14:2 as a model system, we show that this protein is a DNA-dependent RNAP that is translocated into the host cell along with the phage DNA and transcribes early phage genes. We determined the crystal structure of this 2,180-residue enzyme in a self-inhibited state, which probably occurs before virion packaging. This conformation is attained with the help of a cleft-blocking domain that interacts with the active site and occupies the cavity in which the RNA–DNA hybrid binds. Structurally, phi14:2 RNAP is most similar to eukaryotic RNAPs that are involved in RNA interference [6,7], although most of the phi14:2 RNAP structure (nearly 1,600 residues) maps to a new region of the protein fold space. Considering this structural similarity, we propose that eukaryal RNA interference polymerases have their origins in phage, which parallels the emergence of the mitochondrial transcription apparatus [8].
- Published
- 2021
14. Expanded diversity of Asgard archaea and their relationships with eukaryotes
- Author
-
Liu, Yang, Makarova, Kira S., Huang, Wen-Cong, Wolf, Yuri I., Nikolskaya, Anastasia N., Zhang, Xinxu, Cai, Mingwei, Zhang, Cui-Jing, Xu, Wei, Luo, Zhuhua, Cheng, Lei, Koonin, Eugene V., and Li, Meng
- Abstract
Asgard is a recently discovered superphylum of archaea that appears to include the closest archaeal relatives of eukaryotes [1–5]. Debate continues as to whether the archaeal ancestor of eukaryotes belongs within the Asgard superphylum or whether this ancestor is a sister group to all other archaea (that is, a two-domain versus a three-domain tree of life) [6–8]. Here we present a comparative analysis of 162 complete or nearly complete genomes of Asgard archaea, including 75 metagenome-assembled genomes that—to our knowledge—have not previously been reported. Our results substantially expand the phylogenetic diversity of Asgard and lead us to propose six additional phyla that include a deep branch that we have provisionally named Wukongarchaeota. Our phylogenomic analysis does not resolve unequivocally the evolutionary relationship between eukaryotes and Asgard archaea, but instead—depending on the choice of species and conserved genes used to build the phylogeny—supports either the origin of eukaryotes from within Asgard (as a sister group to the expanded Heimdallarchaeota–Wukongarchaeota branch) or a deeper branch for the eukaryote ancestor within archaea. Our comprehensive protein domain analysis using the 162 Asgard genomes results in a major expansion of the set of eukaryotic signature proteins. The Asgard eukaryotic signature proteins show variable phyletic distributions and domain architectures, which is suggestive of dynamic evolution through horizontal gene transfer, gene loss, gene duplication and domain shuffling. The phylogenomics of the Asgard archaea points to the accumulation of the components of the mobile archaeal ‘eukaryome’ in the archaeal ancestor of eukaryotes (within or outside Asgard) through extensive horizontal gene transfer.
- Published
- 2021
15. The LUCA and its complex virome
- Author
-
Krupovic, Mart, Dolja, Valerian V., and Koonin, Eugene V.
- Abstract
The last universal cellular ancestor (LUCA) is the most recent population of organisms from which all cellular life on Earth descends. The reconstruction of the genome and phenotype of the LUCA is a major challenge in evolutionary biology. Given that all life forms are associated with viruses and/or other mobile genetic elements, there is no doubt that the LUCA was a host to viruses. Here, by projecting back in time using the extant distribution of viruses across the two primary domains of life, bacteria and archaea, and tracing the evolutionary histories of some key virus genes, we attempt a reconstruction of the LUCA virome. Even a conservative version of this reconstruction suggests a remarkably complex virome that already included the main groups of extant viruses of bacteria and archaea. We further present evidence of extensive virus evolution antedating the LUCA. The presence of a highly complex virome implies the substantial genomic and pan-genomic complexity of the LUCA itself.
- Published
- 2020
16. Evolutionary classification of CRISPR–Cas systems: a burst of class 2 and derived variants
- Author
-
Makarova, Kira S., Wolf, Yuri I., Iranzo, Jaime, Shmakov, Sergey A., Alkhnbashi, Omer S., Brouns, Stan J. J., Charpentier, Emmanuelle, Cheng, David, Haft, Daniel H., Horvath, Philippe, Moineau, Sylvain, Mojica, Francisco J. M., Scott, David, Shah, Shiraz A., Siksnys, Virginijus, Terns, Michael P., Venclovas, Česlovas, White, Malcolm F., Yakunin, Alexander F., Yan, Winston, Zhang, Feng, Garrett, Roger A., Backofen, Rolf, van der Oost, John, Barrangou, Rodolphe, and Koonin, Eugene V.
- Abstract
The number and diversity of known CRISPR–Cas systems have substantially increased in recent years. Here, we provide an updated evolutionary classification of CRISPR–Cas systems and cas genes, with an emphasis on the major developments that have occurred since the publication of the latest classification, in 2015. The new classification includes 2 classes, 6 types and 33 subtypes, compared with 5 types and 16 subtypes in 2015. A key development is the ongoing discovery of multiple, novel class 2 CRISPR–Cas systems, which now include 3 types and 17 subtypes. A second major novelty is the discovery of numerous derived CRISPR–Cas variants, often associated with mobile genetic elements that lack the nucleases required for interference. Some of these variants are involved in RNA-guided transposition, whereas others are predicted to perform functions distinct from adaptive immunity that remain to be characterized experimentally. The third highlight is the discovery of numerous families of ancillary CRISPR-linked genes, often implicated in signal transduction. Together, these findings substantially clarify the functional diversity and evolutionary history of CRISPR–Cas.
- Published
- 2020
17. Evolutionary entanglement of mobile genetic elements and host defence systems: guns for hire
- Author
-
Koonin, Eugene V., Makarova, Kira S., Wolf, Yuri I., and Krupovic, Mart
- Abstract
All cellular life forms are afflicted by diverse genetic parasites, including viruses and other types of mobile genetic elements (MGEs), and have evolved multiple, diverse defence systems that protect them from MGE assault via different mechanisms. Here, we provide our perspectives on how recent evidence points to tight evolutionary connections between MGEs and defence systems that reach far beyond the proverbial arms race. Defence systems incur a fitness cost for the hosts; therefore, at least in prokaryotes, horizontal mobility of defence systems, mediated primarily by MGEs, is essential for their persistence. Moreover, defence systems themselves possess certain features of selfish elements. Common components of MGEs, such as site-specific nucleases, are ‘guns for hire’ that can also function as parts of defence mechanisms and are often shuttled between MGEs and defence systems. Thus, evolutionary and molecular factors converge to mould the multifaceted, inextricable connection between MGEs and anti-MGE defence systems.
- Published
- 2020
18. Discovery of Oligonucleotide Signaling Mediated by CRISPR-Associated Polymerases Solves Two Puzzles but Leaves an Enigma
- Author
-
Koonin, Eugene V. and Makarova, Kira S.
- Abstract
The signature component of type III CRISPR-Cas systems is the Cas10 protein that consists of two Palm domains homologous to those of DNA and RNA polymerases and nucleotide cyclases and an HD nuclease domain. However, until very recently, the activity of the Palm domains and their role in CRISPR function have not been experimentally established. Most of the type III CRISPR-Cas systems and some type I systems also encompass proteins containing the CARF (CRISPR-associated Rossmann fold) domain that has been predicted to regulate CRISPR functions via nucleotide binding, but its function in CRISPR-Cas remained obscure. Two independent recent studies show that the Palm domain of Cas10 catalyzes synthesis of oligoadenylates, which bind the CARF domain of the Csm6 protein and activate its RNase domain that cleaves foreign transcripts enabling interference by type III CRISPR-Cas. In one coup, these findings resolved two long-standing puzzles of CRISPR biology and reveal a new regulatory pathway that governs the CRISPR response. However, the full extent of this pathway, and especially the driving forces behind the evolution of this complex mechanism of CRISPR-Cas activation, remains to be uncovered.
- Published
- 2024
19. Megataxonomy and global ecology of the virosphere
- Author
-
Koonin, Eugene V, Kuhn, Jens H, Dolja, Valerian V, and Krupovic, Mart
- Abstract
Nearly all organisms are hosts to multiple viruses that collectively appear to be the most abundant biological entities in the biosphere. With recent advances in metagenomics and metatranscriptomics, the known diversity of viruses substantially expanded. Comparative analysis of these viruses using advanced computational methods culminated in the reconstruction of the evolution of major groups of viruses and enabled the construction of a virus megataxonomy, which has been formally adopted by the International Committee on Taxonomy of Viruses. This comprehensive taxonomy consists of six virus realms, which are aspired to be monophyletic and assembled based on the conservation of hallmark proteins involved in capsid structure formation or genome replication. The viruses in different major taxa substantially differ in host range and accordingly in ecological niches. In this review article, we outline the latest developments in virus megataxonomy and the recent discoveries that will likely lead to reassessment of some major taxa, in particular, split of three of the current six realms into two or more independent realms. We then discuss the correspondence between virus taxonomy and the distribution of viruses among hosts and ecological niches, as well as the abundance of viruses versus cells in different habitats. The distribution of viruses across environments appears to be primarily determined by the host ranges, i.e. the virome is shaped by the composition of the biome in a given habitat, which itself is affected by abiotic factors.
- Published
- 2024
20. The depths of virus exaptation.
- Author
-
Koonin, Eugene V and Krupovic, Mart
- Abstract
Highlights
• Viruses and their components are repeatedly exapted for diverse host functions.
• Virus components are often exapted to function in antiviral defense.
• Defective viruses are employed for gene transfer and nutrient storage.
• Retroviruses and LTR retrotransposons are a rich source of new cellular functions.
Viruses are ubiquitous parasites of cellular life forms and the most abundant biological entities on earth. The relationships between viruses and their hosts involve the continuous arms race but are by no account limited to it. Growing evidence shows that, in the course of evolution, viruses and their components are repeatedly recruited (exapted) for host functions. The functions of exapted viruses typically involve either defense from other viruses or cellular competitors or transfer of nucleic acids between cells, or storage functions. Virus exaptation can reach different depths, from recruitment of a fully functional virus to exploitation of defective, partially degraded viruses, to utilization of individual virus proteins. [ABSTRACT FROM AUTHOR]
- Published
- 2018
21. Systematic prediction of functionally linked genes in bacterial and archaeal genomes
- Author
-
Shmakov, Sergey A., Faure, Guilhem, Makarova, Kira S., Wolf, Yuri I., Severinov, Konstantin V., and Koonin, Eugene V.
- Abstract
Functionally linked genes in bacterial and archaeal genomes are often organized into operons. However, the composition and architecture of operons are highly variable and frequently differ even among closely related genomes. Therefore, to efficiently extract reliable functional predictions for uncharacterized genes from comparative analyses of the rapidly growing genomic databases, dedicated computational approaches are required. We developed a protocol to systematically and automatically identify genes that are likely to be functionally associated with a ‘bait’ gene or locus by using relevance metrics. Given a set of bait loci and a genomic database defined by the user, this protocol compares the genomic neighborhoods of the baits to identify genes that are likely to be functionally linked to the baits by calculating the abundance of a given gene within and outside the bait neighborhoods and the distance to the bait. We exemplify the performance of the protocol with three test cases, namely, genes linked to CRISPR–Cas systems using the ‘CRISPRicity’ metric, genes associated with archaeal proviruses and genes linked to Argonaute genes in halobacteria. The protocol can be run by users with basic computational skills. The computational cost depends on the sizes of the genomic dataset and the list of reference loci and can vary from one CPU-hour to hundreds of hours on a supercomputer.
- Published
- 2019
22. Comparative genomics and evolution of trans-activating RNAs in Class 2 CRISPR-Cas systems
- Author
-
Faure, Guilhem, Shmakov, Sergey A., Makarova, Kira S., Wolf, Yuri I., Crawley, Alexandra B., Barrangou, Rodolphe, and Koonin, Eugene V.
- Abstract
Trans-activating CRISPR (tracr) RNA is a distinct RNA species that interacts with the CRISPR (cr) RNA to form the dual guide (g) RNA in type II and subtype V-B CRISPR-Cas systems. The tracrRNA-crRNA interaction is essential for pre-crRNA processing as well as target recognition and cleavage. The tracrRNA consists of an antirepeat, which forms an imperfect hybrid with the repeat in the crRNA, and a distal region containing a Rho-independent terminator. Exhaustive comparative analysis of the sequences and predicted structures of the Class 2 CRISPR guide RNAs shows that all these guide RNAs share distinct structural features, in particular, the nexus stem-loop that separates the repeat-antirepeat hybrid from the distal portion of the tracrRNA and the conserved GU pair at that end of the hybrid. These structural constraints might ensure full exposure of the spacer for target recognition. Reconstruction of tracrRNA evolution for 4 tight bacterial groups demonstrates random drift of repeat-antirepeat complementarity within a window of hybrid stability that is, apparently, maintained by selection. An evolutionary scenario is proposed whereby tracrRNAs evolved on multiple occasions, via rearrangement of a CRISPR array to form the antirepeat in different locations with respect to the array. A functional tracrRNA would form if, in the new location, the antirepeat is flanked by sequences that meet the minimal requirements for a promoter and a Rho-independent terminator. Alternatively, or additionally, the antirepeat sequence could be occasionally ‘reset’ by recombination with a repeat, restoring the functionality of tracrRNAs that drift beyond the required minimal hybrid stability.
- Published
- 2019
- Full Text
- View/download PDF
23. How Does Large-Scale Genomic Analysis Shape Our Understanding of COVID Variants in Real Time?
- Author
-
van Dorp, Lucy, Shey, Muki S., Ghedin, Elodie, Michor, Franziska, Koonin, Eugene V., and Hampson, Katie
- Published
- 2021
- Full Text
- View/download PDF
24. Polintons, virophages and transpovirons: a tangled web linking viruses, transposons and immunity.
- Author
-
Koonin, Eugene V and Krupovic, Mart
- Abstract
Virophages are satellite DNA viruses that depend for their replication on giant viruses of the family Mimiviridae. An evolutionary relationship exists between the virophages and Polintons, large self-synthesizing transposons that are widespread in the genomes of diverse eukaryotes. Most of the Polintons encode homologs of major and minor icosahedral virus capsid proteins and accordingly are predicted to form virions. Additionally, metagenome analysis has led to the discovery of an expansive family of Polinton-like viruses (PLV) that are more distantly related to bona fide Polintons and virophages. Another group of giant virus parasites includes small, linear, double-stranded DNA elements called transpovirons. Recent in-depth comparative genomic analysis has yielded evidence of the origin of the PLV and the transpovirons from Polintons. Integration of virophage genomes into genomes of both giant viruses and protists has been demonstrated. Furthermore, in an experimental coinfection system that consisted of a protist host, a giant virus and an associated virophage, the virophage integrated into the host genome and, after activation of its expression by a superinfecting giant virus, served as an agent of adaptive immunity. There is a striking analogy between this mechanism and the CRISPR-Cas system of prokaryotic adaptive immunity. Taken together, these findings show that Polintons, PLV, virophages and transpovirons form a dynamic network of integrating mobile genetic elements that contribute to the cellular antivirus defense and host–virus coevolution. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
25. The enigmatic archaeal virosphere
- Author
-
Prangishvili, David, Bamford, Dennis H., Forterre, Patrick, Iranzo, Jaime, Koonin, Eugene V., and Krupovic, Mart
- Abstract
One of the most prominent features of archaea is the extraordinary diversity of their DNA viruses. Many archaeal viruses differ substantially in morphology from bacterial and eukaryotic viruses and represent unique virus families. The distinct nature of archaeal viruses also extends to the gene composition and architectures of their genomes and the properties of the proteins that they encode. Environmental research has revealed prominent roles of archaeal viruses in influencing microbial communities in ocean ecosystems, and recent metagenomic studies have uncovered new groups of archaeal viruses that infect extremophiles and mesophiles in diverse habitats. In this Review, we summarize recent advances in our understanding of the genomic and morphological diversity of archaeal viruses and the molecular biology of their life cycles and virus–host interactions, including interactions with archaeal CRISPR–Cas systems. We also examine the potential origins and evolution of archaeal viruses and discuss their place in the global virosphere.
- Published
- 2017
- Full Text
- View/download PDF
26. Diversity and evolution of class 2 CRISPR–Cas systems
- Author
-
Shmakov, Sergey, Smargon, Aaron, Scott, David, Cox, David, Pyzocha, Neena, Yan, Winston, Abudayyeh, Omar O., Gootenberg, Jonathan S., Makarova, Kira S., Wolf, Yuri I., Severinov, Konstantin, Zhang, Feng, and Koonin, Eugene V.
- Abstract
Class 2 CRISPR–Cas systems are characterized by effector modules that consist of a single multidomain protein, such as Cas9 or Cpf1. We designed a computational pipeline for the discovery of novel class 2 variants and used it to identify six new CRISPR–Cas subtypes. The diverse properties of these new systems provide potential for the development of versatile tools for genome editing and regulation. In this Analysis article, we present a comprehensive census of class 2 types and class 2 subtypes in complete and draft bacterial and archaeal genomes, outline evolutionary scenarios for the independent origin of different class 2 CRISPR–Cas systems from mobile genetic elements, and propose an amended classification and nomenclature of CRISPR–Cas.
- Published
- 2017
- Full Text
- View/download PDF
27. Consensus statement: Virus taxonomy in the age of metagenomics
- Author
-
Simmonds, Peter, Adams, Mike J., Benkő, Mária, Breitbart, Mya, Brister, J. Rodney, Carstens, Eric B., Davison, Andrew J., Delwart, Eric, Gorbalenya, Alexander E., Harrach, Balázs, Hull, Roger, King, Andrew M.Q., Koonin, Eugene V., Krupovic, Mart, Kuhn, Jens H., Lefkowitz, Elliot J., Nibert, Max L., Orton, Richard, Roossinck, Marilyn J., Sabanadzovic, Sead, Sullivan, Matthew B., Suttle, Curtis A., Tesh, Robert B., van der Vlugt, René A., Varsani, Arvind, and Zerbini, F. Murilo
- Abstract
The number and diversity of viral sequences that are identified in metagenomic data far exceeds that of experimentally characterized virus isolates. In a recent workshop, a panel of experts discussed the proposal that, with appropriate quality control, viruses that are known only from metagenomic data can, and should be, incorporated into the official classification scheme of the International Committee on Taxonomy of Viruses (ICTV). Although a taxonomy that is based on metagenomic sequence data alone represents a substantial departure from the traditional reliance on phenotypic properties, the development of a robust framework for sequence-based virus taxonomy is indispensable for the comprehensive characterization of the global virome. In this Consensus Statement article, we consider the rationale for why metagenomic sequence data should, and how it can, be incorporated into the ICTV taxonomy, and present proposals that have been endorsed by the Executive Committee of the ICTV.
- Published
- 2017
- Full Text
- View/download PDF
28. Protocol for comparing gene-level selection on coding mutations between two groups of samples with Coselens
- Author
-
Iranzo, Jaime, Gruenhagen, George, Calle-Espinosa, Jorge, and Koonin, Eugene V.
- Abstract
The study of genes that evolve under conditional selection can shed light on the genomic underpinnings of adaptation, revealing epistasis and phenotypic plasticity. This protocol describes how to use the Coselens package to compare gene-level selection between two groups of samples. After installing Coselens and preparing the datasets, a typical run on a laptop takes less than 10 min. Coselens is best suited to analyze somatic mutations and data from experimental evolution, for which independently evolved samples are available.
- Published
- 2023
- Full Text
- View/download PDF
29. Classify viruses — the gain is worth the pain
- Author
-
Kuhn, Jens H., Wolf, Yuri I., Krupovic, Mart, Zhang, Yong-Zhen, Maes, Piet, Dolja, Valerian V., and Koonin, Eugene V.
- Abstract
Viruses hold solutions to a lot of problems, so let’s fund and reward cataloguing, urge Jens H. Kuhn and colleagues.
- Published
- 2019
- Full Text
- View/download PDF
30. Correlations between Quantitative Measures of Genome Evolution, Expression and Function.
- Author
-
Eisenhaber, Frank, Wolf, Yuri I., Carmel, Liran, and Koonin, Eugene V.
- Abstract
In addition to multiple, complete genome sequences, genome-wide data on biological properties of genes, such as knockout effect, expression levels, protein-protein interactions, and others, are rapidly accumulating. Numerous attempts were made by many groups to examine connections between these properties and quantitative measures of gene evolution. The questions addressed pertain to the most fundamental aspects of biology: what determines the effect of the knockout of a given gene on the phenotype (in particular, is it essential or not) and the rate of a gene's evolution and how are the phenotypic properties and evolution connected? Many significant correlations were detected, e.g., positive correlation between the tendency of a gene to be lost during evolution and sequence evolution rate, and negative correlations between each of the above measures of evolutionary variability and expression level or the phenotypic effect of gene knockout. However, most of these correlations are relatively weak and explain a small fraction of the variation present in the data. We propose that the majority of the relationships between the phenotypic ("input") and evolutionary ("output") variables can be described with a single, composite variable, the gene's "social status in the genomic community", which reflects the biological role of the gene and its mode of evolution. "High-status" genes, involved in house-keeping processes, are more likely to be higher and broader expressed, to have more interaction partners, and to produce lethal or severely impaired knockout mutants. These genes also tend to evolve slower and are less prone to gene loss across various taxonomic groups. "Low-status" genes are expected to be weakly expressed, have fewer interaction partners, and exhibit narrower (and less coherent) phyletic distribution. On average, these genes evolve faster and are more often lost during evolution than high-status genes.
The "gene status" notion may serve as a generator of null hypotheses regarding the connections between phenotypic and evolutionary parameters associated with genes. Any deviation from the expected pattern calls for attention—to the quality of the data, the nature of the analyzed relationship, or both. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
31. The Drosophila Protein Interaction Network May Be neither Power-Law nor Scale-Free.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., and Bader, J. S.
- Abstract
Scale-free networks have become a topic of intense interest because of the potential to develop theories universally applicable to networks representing social interactions, internet connectivity, and biological processes. Scale-free topology is associated with power-law distributions of connectivity, in which most network components have only few connections while a very few components are extremely highly-connected. Here we investigate the power-law and scale-free properties of the network corresponding to protein-protein interactions in Drosophila melanogaster. We examine power-law behavior with a standard statistical technique designed to distinguish whether a power-law fit is adequate to describe the vertex degree distribution. We find that the degree distribution for the entire network, consisting of baits and preys, decays faster than power law. This fit may be confounded by artifacts of the screening procedure. The prey-only degree distribution is less likely to be confounded by the screening procedure, and is fit adequately by a power-law. When only the biologically relevant interactions are considered, however, the degree distribution again decays faster than power-law. Thus, power-law behavior may reflect interactions that are observed in vitro but not in vivo. We next describe an algorithm that may be able to extract the true distribution from the incomplete data. Finally, we investigate scale-free properties by characterizing organizational patterns over increasing spatial scales. We provide evidence for the existence of a length-scale that characterizes organization in the network. The existence of such a correlation length stands in contrast to scale-free networks, in which no length scale is special. These results suggest that the Drosophila protein interaction network may not be power-law and is not scale-free. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
32. Scaling Laws in the Functional Content of Genomes.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., and van Nimwegen, Erik
- Published
- 2006
- Full Text
- View/download PDF
33. Neutrality and Selection in the Evolution of Gene Families.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., and Yanai, Itai
- Abstract
Evolutionary relationships among genes, as revealed by sequence similarity, are used to characterize gene families. Surprisingly, a power-law can reasonably describe the distribution of sizes of a genome's gene families. Evolutionary models are able to reproduce the size distribution with simulations of a set of genes growing through duplications and modifications. Most conspicuously, positive selection is not included in the models, suggesting perhaps, that neutral forces determine gene family sizes. Here I advocate this notion with comparative genomic analyses and a review of recent research on the evolution of gene duplicates. I show that a power-law also relates the sizes of orthologous gene families across 66 known microbial genomes. Furthermore, singletons (gene families of size = 1) in one genome have orthologs that are themselves power-law distributed in other genomes. The signature of positive selection, however, is revealed in the fact that gene families of size six and more have a more skewed family sizes distribution across other genomes. The general pleiotropy of genes and the notion that gene duplicates may rapidly subfunctionalize support the conception of gene family growth without positive selection. Such a model runs contrary to Susumu Ohno's famous dictum that only "redundancy created" and suggests a novel view of the evolution of functional novelty. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
34. The Role of Computation in Complex Regulatory Networks.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., Fernández, Pau, and Solé, Ricard V.
- Abstract
Biological phenomena differ significantly from physical phenomena. At the heart of this distinction is the fact that biological entities have computational abilities and thus they are inherently difficult to predict. This is the reason why simplified models that provide the minimal requirements for computation turn out to be very useful to study networks of many components. In this chapter, we briefly review the dynamical aspects of models of regulatory networks, discussing their most salient features, and we also show how these models can give clues about the way in which networks may organize their capacity to evolve, by providing simple examples of the implementation of robustness and modularity. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
35. The Protein Universes.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., and Rackovsky, S.
- Abstract
We discuss some informatic problems in protein classification. We first address a neglected problem in sequence classification-information loss resulting from alphabet contraction. Since the use of reduced alphabets is a standard bioinformatic tool, this is a significant issue. We review recent work in which it was shown that information theoretic methods can be used to quantitate the amount of structural information carried by a specified sequence representation. These tools are then used to construct reduced alphabets of specified size which retain the maximum possible amount of structural information. We then turn to structure classification. After briefly reviewing previous work in this field, we discuss the fact that sequence and structure classification give different pictures of the protein space. We outline ongoing research in which new parameters are sought which explicitly encode architecture choice by protein sequences. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
36. Analytical Evolutionary Model for Protein Fold Occurrence in Genomes, Accounting for the Effects of Gene Duplication, Deletion, Acquisition and Selective Pressure.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., Kamal, Michael, Luscombe, Nicholas M., Jiang Qian, and Gerstein, Mark
- Published
- 2006
- Full Text
- View/download PDF
37. Power Law Correlations in DNA Sequences.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., and Buldyrev, Sergey V.
- Published
- 2006
- Full Text
- View/download PDF
38. Gene Regulatory Networks.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., Dewey, T. Gregory, and Galas, David J.
- Abstract
Two gene regulatory networks inferred from different types of data are considered in this chapter. Gene expression networks are networks inferred from microarray time series data and transcription factor networks are networks obtained from a new genome-wide technique that allows an identification of all of the DNA binding sites for each transcription factor (TF). While addressing the same underlying questions, these networks reflect different properties of gene regulation and provide different insights. The gene expression network is inferred from dynamic analysis of time series data of gene expression profiles. The TF networks, on the other hand, are a direct result of experimental observation of a physical association between a TF and a DNA binding site, which (except for experimental noise) is unique. While our knowledge of the transcription factor networks is limited, these networks provide insights into a regulatory core network of TFs that regulate each other, and drive all network interconnectivity. In both cases, the resulting networks show features that may be universal to biological systems. The global properties of such networks show the scale-free distributions of node connectivity indicative of a hierarchical network and also exhibit small world graph properties. We discuss a network growth model based on gene duplication that provides excellent agreement with the global network parameters derived from the analysis of experimental expression data. In addition to these global properties, the local properties of these gene expression networks can be used in data mining and classification. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
39. Scale-Free Evolution.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., Dokholyan, Nikolay V., and Shakhnovich, Eugene I.
- Published
- 2006
- Full Text
- View/download PDF
40. Birth and Death Models of Genome Evolution.
- Author
-
Karev, Georgy P., Wolf, Yuri I., and Koonin, Eugene V.
- Abstract
Gene duplication is the primary avenue of genome evolution. The gene repertoire of any species can be described as an ensemble of paralogous gene families, ranging in size from one to large numbers that amount to a substantial fraction of genes in the respective genome. Evolution of such an ensemble is naturally represented by a birth-and-death process, the birth of a gene being duplication, and death being gene inactivation and elimination. In addition to gene duplication and loss, evolution of gene families involves "true" innovation, i.e., appearance of genes new to the given lineage through horizontal gene transfer, emergence of genes from noncoding sequences, and change of preexisting genes beyond recognition. Assuming these three elementary processes, we developed a simple theoretical framework for analysis of genome evolution, the Birth, Death and Innovation Models (BDIMs). Comparison of the predictions made by different versions of BDIMs with empirical distributions of paralogous family size in genomes allows one to choose the adequate models. Stable family size distributions can evolve only under balanced BDIMs, in which duplication and deletion rates are asymptotically equal up to the second order. The linear BDIM, in which there is almost no dependence between the family size and birth-death rates, readily approximates the observed family size distribution at equilibrium. However, the stochastic version of this model yields unrealistic times for evolution of the large paralogous families that were detected in all genomes. In order to produce reasonable rates of family evolution, one needs to turn to nonlinear higher-degree BDIMs, which imply "interactions" between paralogs. These interactions may be interpreted as a proxy for natural selection, which should drive evolution of large paralogous families if their emergence is to be viewed as an adaptive reaction. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
41. The Connectivity of Large Genetic Networks.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., and Wagner, Andreas
- Abstract
I review evolutionary explanations of broad-tailed connectivity or degree distributions observed in metabolic networks and protein interaction networks. Self-assembled chemical reaction networks show degree distributions similar to those observed for metabolic networks, which argues against the postulated role of natural selection in maintaining this degree distribution. In addition, metabolic networks contain traces of their ancient history in the form of highly connected metabolites. Similarly to the degree distribution of metabolic networks, that of protein interaction networks can be explained without resorting to natural selection on the network level. I present data suggesting that highly connected proteins are not distinguishably older than other proteins, and explain this finding with a simple model of how a protein's degree changes in evolutionary time. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
42. Large-Scale Topological Properties of Molecular Networks.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., Maslov, Sergei, and Sneppen, Kim
- Abstract
Bio-molecular networks lack the top-down design. Instead, selective forces of biological evolution shape them from raw material provided by random events such as gene duplications and single gene mutations. As a result individual connections in these networks are characterized by a large degree of randomness. One may wonder which connectivity patterns are indeed random, while which arose due to the network growth, evolution, and/or its fundamental design principles and limitations? Here we introduce a general method allowing one to construct a random null-model version of a given network while preserving the desired set of its low-level topological features, such as, e.g., the number of neighbors of individual nodes, the average level of modularity, preferential connections between particular groups of nodes, etc. Such a null-model network can then be used to detect and quantify the nonrandom topological patterns present in large networks. In particular, we measured correlations between degrees of interacting nodes in protein interaction and regulatory networks in yeast. It was found that in both these networks, links between highly connected proteins are systematically suppressed. This effect decreases the likelihood of cross-talk between different functional modules of the cell, and increases the overall robustness of a network by localizing effects of deleterious perturbations. It also teaches us about the overall computational architecture of such networks and points at the origin of large differences in the number of neighbors of individual nodes. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
43. Graphical Analysis of Biocomplex Networks and Transport Phenomena.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., Kwang-Il Goh, Byungnam Kahng, and Doochul Kim
- Abstract
Many biocomplex networks such as the protein interaction networks and the metabolic networks exhibit an emerging pattern that the distribution of the number of connections of a protein or substrate follows a power law. As the network theory is developed recently, several quantities describing network structure such as modularity and degree-degree correlation have been introduced. Here we investigate and compare the structural properties of the yeast protein networks for different datasets with those quantities. Moreover, we introduce a new quantity, called the load, characterizing the amount of signal passing through a vertex. It is shown that the load distribution also follows a power law, and its characteristics are related to the structure of the core part of the biocomplex networks. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
44. Power Laws in Biological Networks.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., Karev, Georgy P., Almaas, Eivind, and Barabási, Albert-László
- Abstract
The rapidly developing theory of complex networks indicates that real networks are not random, but have a highly robust large-scale architecture, governed by strict organizational principles. Here, we focus on the properties of biological networks, discussing their scale-free and hierarchical features. We illustrate the major network characteristics using examples from the metabolic network of the bacterium Escherichia coli. We also discuss the principles of network utilization, acknowledging that the interactions in a real network have unequal strengths. We study the interplay between topology and reaction fluxes provided by flux-balance analysis. We find that the cellular utilization of the metabolic network is both globally and locally highly inhomogeneous, dominated by "hot-spots", representing connected high-flux pathways. [ABSTRACT FROM AUTHOR]
- Published
- 2006
- Full Text
- View/download PDF
45. An Expectation-Maximization Algorithm for Analysis of Evolution of Exon-Intron Structure of Eukaryotic Genes.
- Author
-
McLysaght, Aoife, Huson, Daniel H., Bachrach, Abraham, Carmel, Liran, Rogozin, Igor B., Wolf, Yuri I., and Koonin, Eugene V.
- Abstract
We propose a detailed model of evolution of exon-intron structure of eukaryotic genes that takes into account gene-specific intron gain and loss rates, branch-specific gain and loss coefficients, invariant sites incapable of intron gain, and rate variability of both gain and loss which is gamma-distributed across sites. We develop an expectation-maximization algorithm to estimate the parameters of this model, and study its performance using simulated data. [ABSTRACT FROM AUTHOR]
- Published
- 2005
- Full Text
- View/download PDF
46. Universal genome cutters: from selfish genetic elements to antivirus defence and genome editing tools
- Author
-
Koonin, Eugene V.
- Published
- 2016
- Full Text
- View/download PDF
47. A virocentric perspective on the evolution of life.
- Author
-
Koonin, Eugene V and Dolja, Valerian V
- Abstract
Highlights: [•] We present an overview of the evolution of the virus world and its multifaceted, complex interaction with cellular life forms. [•] Several viral hallmark genes that encode essential proteins are shared by extremely diverse groups of viruses. [•] The existence of hallmark genes implies that viruses and related selfish elements evolved from the primordial gene pool. [•] The emergence of selfish elements is theoretically inevitable in any ensemble of evolving replicators. [•] Virus–host arms races and cooperation were among the decisive factors in the evolution of all life forms. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
48. Common origins and host-dependent diversity of plant and animal viromes.
- Author
-
Dolja, Valerian V and Koonin, Eugene V
- Subjects
VIRUS research, HOST-virus relationships, PLANT viruses, CAPSIDS, VIRAL evolution, VIRAL genomes, VIRAL replication, BIOLOGICAL divergence
- Abstract
Many viruses infecting animals and plants share common cores of homologous genes involved in the key processes of viral replication. In contrast, genes that mediate virus–host interactions including in many cases capsid protein (CP) genes are markedly different. There are three distinct scenarios for the origin of related viruses of plants and animals: first, evolution from a common ancestral virus predating the divergence of plants and animals; second, horizontal transfer of viruses, for example, through insect vectors; third, parallel origin from related genetic elements. We present evidence that each of these scenarios contributed, to a varying extent, to the evolution of different groups of viruses. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
49. Computer Analysis of Amino Acid Sequences.
- Author
-
Walker, John M., Foster, Gary D., Taylor, Sally C., Koonin, Eugene V., Mushegian, Arcady R., and Dolja, Valerian V.
- Abstract
Genome sequences are of minimal use without an adequate interpretation of the sequences of putative protein products, which is only possible on the basis of detailed computer analysis. The approaches to amino acid sequence analysis can be roughly divided into those that explore intrinsic properties of proteins, such as hydropathy, secondary structure, distribution of different types of amino acid sequences, and so on, and those that search for sequence similarity. Both approaches include numerous algorithms and computer programs. In this short chapter, we cannot describe all or even the most widely used and valuable of these methods. Instead, we present a minimal set of procedures that, in our experience, is useful in order to extract a substantial amount of information from an amino acid sequence in a relatively short time. For detailed descriptions of various computer methods for sequence analysis, the reader is referred to the recently published reviews and Methods in Enzymology collections (1-3). [ABSTRACT FROM AUTHOR]
- Published
- 1998
- Full Text
- View/download PDF
50. Pervasive conditional selection of driver mutations and modular epistasis networks in cancer
- Author
-
Iranzo, Jaime, Gruenhagen, George, Calle-Espinosa, Jorge, and Koonin, Eugene V.
- Abstract
Cancer driver mutations often display mutual exclusion or co-occurrence, underscoring the key role of epistasis in carcinogenesis. However, estimating the magnitude of epistasis and quantifying its effect on tumor evolution remains a challenge. We develop a method (Coselens) to quantify conditional selection on the excess of nonsynonymous substitutions in cancer genes. Coselens infers the number of drivers per gene in different partitions of a cancer genomics dataset using covariance-based mutation models and determines whether coding mutations in a gene affect selection for drivers in any other gene. Using Coselens, we identify 296 conditionally selected gene pairs across 16 cancer types in the TCGA dataset. Conditional selection affects 25%–50% of driver substitutions in tumors with >2 drivers. Conditionally co-selected genes form modular networks, whose structures challenge the traditional interpretation of within-pathway mutual exclusivity and across-pathway synergy, suggesting a more complex scenario where gene-specific across-pathway epistasis shapes differentiated cancer subtypes.
- Published
- 2022
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library