150 results on '"Yuri, I."'
Search Results
2. Microbial diversity and ecological complexity emerging from environmental variation and horizontal gene transfer in a simple mathematical model
- Author
-
Babajanyan, Sanasar G., Garushyants, Sofya K., Wolf, Yuri I., and Koonin, Eugene V.
- Published
- 2024
- Full Text
- View/download PDF
3. Host age structure reshapes parasite symbiosis: collaboration begets pathogens, competition begets virulent mutualists
- Author
-
Portner, Carsten O. S., Rong, Edward G., Ramirez, Jared A., Wolf, Yuri I., Bosse, Angelique P., Koonin, Eugene V., and Rochman, Nash D.
- Published
- 2022
- Full Text
- View/download PDF
4. Analysis of lineage-specific protein family variability in prokaryotes combined with evolutionary reconstructions
- Author
-
Karamycheva, Svetlana, Wolf, Yuri I., Persi, Erez, Koonin, Eugene V., and Makarova, Kira S.
- Published
- 2022
- Full Text
- View/download PDF
5. Phylogenomic analysis of the diversity of graspetides and proteins involved in their biosynthesis
- Author
-
Makarova, Kira S., Blackburne, Brittney, Wolf, Yuri I., Nikolskaya, Anastasia, Karamycheva, Svetlana, Espinoza, Marlene, Barry, III, Clifton E., Bewley, Carole A., and Koonin, Eugene V.
- Published
- 2022
- Full Text
- View/download PDF
6. Assessment of assumptions underlying models of prokaryotic pangenome evolution
- Author
-
Sela, Itamar, Wolf, Yuri I., and Koonin, Eugene V.
- Published
- 2021
- Full Text
- View/download PDF
7. Prediction of the incubation period for COVID-19 and future virus disease outbreaks
- Author
-
Gussow, Ayal B., Auslander, Noam, Wolf, Yuri I., and Koonin, Eugene V.
- Published
- 2020
- Full Text
- View/download PDF
8. Modified base-binding EVE and DCD domains: striking diversity of genomic contexts in prokaryotes and predicted involvement in a variety of cellular processes
- Author
-
Bell, Ryan T., Wolf, Yuri I., and Koonin, Eugene V.
- Published
- 2020
- Full Text
- View/download PDF
9. Stable coevolutionary regimes for genetic parasites and their hosts: you must differ to coevolve
- Author
-
Berezovskaya, Faina, Karev, Georgy P., Katsnelson, Mikhail I., Wolf, Yuri I., and Koonin, Eugene V.
- Published
- 2018
- Full Text
- View/download PDF
10. Phylogenomics of prokaryotic ribosomal proteins
- Author
-
Yutin, Natalya, Makarova, Kira S, Wolf, Yuri I, and Koonin, Eugene V
- Published
- 2011
- Full Text
- View/download PDF
11. A comprehensive census of horizontal gene transfers from prokaryotes to unikonts
- Author
-
Puigbò, Pere, Mekhedov, Sergei, Wolf, Yuri I, and Koonin, Eugene V
- Published
- 2011
- Full Text
- View/download PDF
12. A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes
- Author
-
Koonin, Eugene V, Fedorova, Natalie D, Jackson, John D, Jacobs, Aviva R, Krylov, Dmitri M, Makarova, Kira S, Mazumder, Raja, Mekhedov, Sergei L, Nikolskaya, Anastasia N, Rao, B Sridhar, Rogozin, Igor B, Smirnov, Sergei, Sorokin, Alexander V, Sverdlov, Alexander V, Vasudevan, Sona, Wolf, Yuri I, Yin, Jodie J, and Natale, Darren A
- Published
- 2004
- Full Text
- View/download PDF
13. Evolution of mosaic operons by horizontal gene transfer and gene displacement in situ
- Author
-
Omelchenko, Marina V, Makarova, Kira S, Wolf, Yuri I, Rogozin, Igor B, and Koonin, Eugene V
- Published
- 2003
- Full Text
- View/download PDF
14. Evolution of gene fusions: horizontal transfer versus independent events
- Author
-
Yanai, Itai, Wolf, Yuri I, and Koonin, Eugene V
- Published
- 2002
- Full Text
- View/download PDF
15. Selection in the evolution of gene duplications
- Author
-
Kondrashov, Fyodor A, Rogozin, Igor B, Wolf, Yuri I, and Koonin, Eugene V
- Published
- 2002
- Full Text
- View/download PDF
16. Constant relative rate of protein evolution and detection of functional diversification among bacterial, archaeal and eukaryotic proteins
- Author
-
Jordan, I King, Kondrashov, Fyodor A, Rogozin, Igor B, Tatusov, Roman L, Wolf, Yuri I, and Koonin, Eugene V
- Published
- 2001
- Full Text
- View/download PDF
17. Interkingdom gene fusions
- Author
-
Wolf, Yuri I, Kondrashov, Alexey S, and Koonin, Eugene V
- Published
- 2000
- Full Text
- View/download PDF
18. Towards understanding the first genome sequence of a crenarchaeon by genome annotation using clusters of orthologous groups of proteins (COGs)
- Author
-
Natale, Darren A, Shankavaram, Uma T, Galperin, Michael Y, Wolf, Yuri I, Aravind, L, and Koonin, Eugene V
- Published
- 2000
- Full Text
- View/download PDF
19. Reconstruction of the evolution of microbial defense systems
- Author
-
Yuri I. Wolf, Eugene V. Koonin, Pere Puigbò, Kira S. Makarova, and David M. Kristensen
- Subjects
0301 basic medicine ,Evolution ,030106 microbiology ,Locus (genetics) ,Bacterial genome size ,Biology ,Genome ,03 medical and health sciences ,Phylogenetics ,Genome, Archaeal ,Defense Gene ,Gene Duplication ,QH359-425 ,Genome Editing ,Gene family ,Evolutionary dynamics ,Gene ,Ecology, Evolution, Behavior and Systematics ,Phylogeny ,Genetics ,Clostridium Botulinum ,Likelihood Functions ,Phylogenetic tree ,Bacteria ,Archaea ,Biological Evolution ,Gene Gain ,030104 developmental biology ,Acne ,Prokaryotic Cells ,Evolutionary biology ,Genome, Bacterial ,Research Article
- Abstract
Background: Evolution of bacterial and archaeal genomes is a highly dynamic process that involves intensive loss of genes as well as gene gain via horizontal transfer, with a lesser contribution from gene duplication. The rates of these processes can be estimated by comparing genomes that are linked by an evolutionary tree. These estimated rates of genome dynamics events substantially differ for different functional classes of genes. The genes involved in defense against viruses and other invading DNA are among those that are gained and lost at the highest rates. Results: We employed a stochastic birth-and-death model to obtain maximum likelihood estimates of the rates of gain and loss of defense genes in 35 groups of closely related bacterial genomes and one group of archaeal genomes. We find that on average, the defense genes experience 1.4 fold higher flux than the rest of microbial genes. This excessive flux of defense genes over the genomic mean is consistent across diverse microbial groups. The few exceptions include intracellular parasites with small, degraded genomes that possess few defense systems which are more stable than in other microbes. Generally, defense genes follow the previously established pattern of genome dynamics, with gene family loss being about 3 times more common than gain and an order of magnitude more common than expansion or contraction of gene families. Case by case analysis of the evolutionary dynamics of defense genes indicates frequent multiple events in the same locus and widespread involvement of mobile elements in the gain and loss of defense genes. Conclusions: Evolution of microbial defense systems is highly dynamic but, notwithstanding the host-parasite arms race, generally follows the same trends that have been established for the rest of the genes.
Apart from the paucity and the low flux of defense genes in parasitic bacteria with deteriorating genomes, there is no clear connection between the evolutionary regime of defense systems and microbial life style. Electronic supplementary material: The online version of this article (doi:10.1186/s12862-017-0942-y) contains supplementary material, which is available to authorized users.
- Published
- 2017
20. Inevitability of the emergence and persistence of genetic parasites caused by evolutionary instability of parasite-free states.
- Author
-
Koonin, Eugene V., Wolf, Yuri I., and Katsnelson, Mikhail I.
- Subjects
*PARASITES , *GENETICS , *CYTOLOGY , *BIOLOGICAL evolution , *BIOLOGICAL variation , *EXPERIMENTAL biology
- Abstract
Genetic parasites, including viruses and mobile genetic elements, are ubiquitous among cellular life forms, and moreover, are the most abundant biological entities on earth that harbor the bulk of the genetic diversity. Here we examine simple thought experiments to demonstrate that both the emergence of parasites in simple replicator systems and their persistence in evolving life forms are inevitable because the putative parasite-free states are evolutionarily unstable. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
21. Phylogenomics of Cas4 family nucleases.
- Author
-
Hudaiberdiev, Sanjarbek, Shmakov, Sergey, Wolf, Yuri I., Terns, Michael P., Makarova, Kira S., and Koonin, Eugene V.
- Subjects
ENDONUCLEASES ,MOBILE genetic elements ,ARCHAEBACTERIA ,MICROBIAL genes ,ANIMAL defenses
- Abstract
Background: The Cas4 family endonuclease is a component of the adaptation module in many variants of CRISPR-Cas adaptive immunity systems. Unlike most of the other Cas proteins, Cas4 is often encoded outside CRISPR-cas loci (solo-Cas4) and is also found in mobile genetic elements (MGE-Cas4). Results: As part of our ongoing investigation of CRISPR-Cas evolution, we explored the phylogenomics of the Cas4 family. About 90% of the archaeal genomes encode Cas4 compared to only about 20% of the bacterial genomes. Many archaea encode both the CRISPR-associated form (CAS-Cas4) and solo-Cas4, whereas in bacteria, this combination is extremely rare. The solo-cas4 genes are over-represented in environmental bacteria and archaea with small genomes that typically lack CRISPR-Cas, suggesting that Cas4 could perform uncharacterized defense or repair functions in these microbes. Phylogenomic analysis indicates that both the CRISPR-associated cas4 genes are often transferred horizontally but almost exclusively, as part of the adaptation module. The evolutionary integrity of the adaptation module sharply contrasts the rampant shuffling of CRISPR-cas modules whereby a given variant of the adaptation module can combine with virtually any effector module. The solo-cas4 genes evolve primarily via vertical inheritance and are subject only to occasional horizontal transfer. The selection pressure on cas4 genes does not substantially differ between CAS-Cas4 and solo-cas4 and is close to the genomic median. Thus, cas4 genes, similarly to cas1 and cas2, evolve similarly to 'regular' microbial genes involved in various cellular functions, showing no evidence of direct involvement in virus-host arms races. A notable feature of the Cas4 family evolution is the frequent recruitment of cas4 genes by various mobile genetic elements (MGE), particularly, archaeal viruses. The functions of Cas4 in these elements are unknown and potentially might involve anti-defense roles. 
Conclusions: Unlike most of the other Cas proteins, Cas4 family members are as often encoded by stand-alone genes as they are incorporated in CRISPR-Cas systems. In addition, cas4 genes were repeatedly recruited by MGE, perhaps, for anti-defense functions. Experimental characterization of the solo and MGE-encoded Cas4 nucleases is expected to reveal currently uncharacterized defense and anti-defense systems and their interactions with CRISPR-Cas systems. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
22. Reconstruction of the evolution of microbial defense systems.
- Author
-
Puigbò, Pere, Makarova, Kira S., Kristensen, David M., Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
CHROMOSOME duplication ,GENOMES ,MOLECULAR genetics ,MOLECULAR biology ,MICROORGANISMS
- Abstract
Background: Evolution of bacterial and archaeal genomes is a highly dynamic process that involves intensive loss of genes as well as gene gain via horizontal transfer, with a lesser contribution from gene duplication. The rates of these processes can be estimated by comparing genomes that are linked by an evolutionary tree. These estimated rates of genome dynamics events substantially differ for different functional classes of genes. The genes involved in defense against viruses and other invading DNA are among those that are gained and lost at the highest rates. Results: We employed a stochastic birth-and-death model to obtain maximum likelihood estimates of the rates of gain and loss of defense genes in 35 groups of closely related bacterial genomes and one group of archaeal genomes. We find that on average, the defense genes experience 1.4 fold higher flux than the rest of microbial genes. This excessive flux of defense genes over the genomic mean is consistent across diverse microbial groups. The few exceptions include intracellular parasites with small, degraded genomes that possess few defense systems which are more stable than in other microbes. Generally, defense genes follow the previously established pattern of genome dynamics, with gene family loss being about 3 times more common than gain and an order of magnitude more common than expansion or contraction of gene families. Case by case analysis of the evolutionary dynamics of defense genes indicates frequent multiple events in the same locus and widespread involvement of mobile elements in the gain and loss of defense genes. Conclusions: Evolution of microbial defense systems is highly dynamic but, notwithstanding the host-parasite arms race, generally follows the same trends that have been established for the rest of the genes. 
Apart from the paucity and the low flux of defense genes in parasitic bacteria with deteriorating genomes, there is no clear connection between the evolutionary regime of defense systems and microbial life style. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
23. Phylogenomics of prokaryotic ribosomal proteins
- Author
-
Kira S. Makarova, Eugene V. Koonin, Natalya Yutin, and Yuri I. Wolf
- Subjects
lcsh:Medicine ,Genome ,Genome, Archaeal ,Phylogenomics ,Databases, Genetic ,Genome Databases ,RefSeq ,Archaeal Taxonomy ,lcsh:Science ,Genome Evolution ,Conserved Sequence ,Phylogeny ,Genetics ,0303 health sciences ,Multidisciplinary ,Archaeal Evolution ,Genomics ,Genome project ,Phylogenetics ,Research Article ,Ribosomal Proteins ,Genome evolution ,Archaeans ,Archaeal Proteins ,Molecular Sequence Data ,Computational biology ,Biology ,Microbiology ,Evolution, Molecular ,03 medical and health sciences ,Bacterial Proteins ,Genome Analysis Tools ,Ribosomal protein ,28S ribosomal RNA ,Position-Specific Scoring Matrices ,Evolutionary Systematics ,Amino Acid Sequence ,Gene Prediction ,030304 developmental biology ,Comparative genomics ,Evolutionary Biology ,Bacterial Evolution ,030306 microbiology ,lcsh:R ,Bacterial Taxonomy ,Computational Biology ,Genomic Evolution ,Bacteriology ,Gene Annotation ,Comparative Genomics ,Archaea ,Organismal Evolution ,Human genetics ,Prokaryotic Cells ,Microbial Evolution ,Poster Presentation ,lcsh:Q ,Sequence Alignment ,Genome, Bacterial
- Abstract
Archaeal and bacterial ribosomes contain more than 50 proteins, including 34 that are universally conserved in the three domains of cellular life (bacteria, archaea, and eukaryotes). Despite the high sequence conservation, annotation of ribosomal (r-) protein genes is often difficult because of their short lengths and biased sequence composition. We developed an automated computational pipeline for identification of r-protein genes and applied it to 995 completely sequenced bacterial and 87 archaeal genomes available in the RefSeq database. The pipeline employs curated seed alignments of r-proteins to run position-specific scoring matrix (PSSM)-based BLAST searches against six-frame genome translations, mitigating possible gene annotation errors. As a result of this analysis, we performed a census of prokaryotic r-protein complements, enumerated missing and paralogous r-proteins, and analyzed the distributions of ribosomal protein genes among chromosomal partitions. Phyletic patterns of bacterial and archaeal r-protein genes were mapped to phylogenetic trees reconstructed from concatenated alignments of r-proteins to reveal the history of likely multiple independent gains and losses. These alignments, available for download, can be used as search profiles to improve genome annotation of r-proteins and for further comparative genomics studies.
- Published
- 2011
24. The common ancestry of life
- Author
-
Yuri I. Wolf and Eugene V. Koonin
- Subjects
Immunology ,Origin of Life ,Biology ,ENCODE ,General Biochemistry, Genetics and Molecular Biology ,Homology (biology) ,Conserved sequence ,Cellular life ,Animals ,Humans ,lcsh:QH301-705.5 ,Gene ,Ecology, Evolution, Behavior and Systematics ,Conserved Sequence ,Genetics ,Translation system ,Agricultural and Biological Sciences(all) ,Biochemistry, Genetics and Molecular Biology(all) ,Applied Mathematics ,Comment ,Genetic code ,Common ancestry ,Biological Evolution ,lcsh:Biology (General) ,Evolutionary biology ,Genetic Code ,Modeling and Simulation ,General Agricultural and Biological Sciences
- Abstract
Background: It is common belief that all cellular life forms on earth have a common origin. This view is supported by the universality of the genetic code and the universal conservation of multiple genes, particularly those that encode key components of the translation system. A remarkable recent study claims to provide a formal, homology independent test of the Universal Common Ancestry hypothesis by comparing the ability of a common-ancestry model and a multiple-ancestry model to predict sequences of universally conserved proteins. Results: We devised a computational experiment on a concatenated alignment of universally conserved proteins which shows that the purported demonstration of the universal common ancestry is a trivial consequence of significant sequence similarity between the analyzed proteins. The nature and origin of this similarity are irrelevant for the prediction of "common ancestry" by the model-comparison approach. Thus, homology (common origin) of the compared proteins remains an inference from sequence similarity rather than an independent property demonstrated by the likelihood analysis. Conclusion: A formal demonstration of the Universal Common Ancestry hypothesis has not been achieved and is unlikely to be feasible in principle. Nevertheless, the evidence in support of this hypothesis provided by comparative genomics is overwhelming. Reviewers: This article was reviewed by William Martin, Ivan Iossifov (nominated by Andrey Rzhetsky) and Arcady Mushegian. For the complete reviews, see the Reviewers' Report section.
- Published
- 2010
25. Prokaryotic homologs of Argonaute proteins are predicted to function as key components of a novel system of defense against mobile genetic elements
- Author
-
John van der Oost, Yuri I. Wolf, Eugene V. Koonin, and Kira S. Makarova
- Subjects
archaeoglobus-fulgidus ,comparative genomics ,chemistry.chemical_compound ,silencing complex ,Plasmid ,RNA interference ,Microbiologie ,Bacteriophages ,Guide RNA ,Eukaryotic Initiation Factors ,lcsh:QH301-705.5 ,Phylogeny ,Genetics ,Deoxyribonucleases ,Genome ,Agricultural and Biological Sciences(all) ,Applied Mathematics ,Argonaute ,Hypothesis ,slicer activity ,Modeling and Simulation ,regulatory rnas ,General Agricultural and Biological Sciences ,Gene Transfer, Horizontal ,Immunology ,Molecular Sequence Data ,Piwi-interacting RNA ,Biology ,Microbiology ,General Biochemistry, Genetics and Molecular Biology ,Amino Acid Sequence ,Gene ,Ecology, Evolution, Behavior and Systematics ,VLAG ,Sequence Homology, Amino Acid ,Biochemistry, Genetics and Molecular Biology(all) ,RNA ,crystal-structure ,structural basis ,aeolicus argonaute ,messenger-rna targets ,Protein Structure, Tertiary ,Interspersed Repetitive Sequences ,lcsh:Biology (General) ,chemistry ,Prokaryotic Cells ,Immune System ,histone deacetylase ,Sequence Alignment ,DNA
- Abstract
Background: In eukaryotes, RNA interference (RNAi) is a major mechanism of defense against viruses and transposable elements as well as of regulating translation of endogenous mRNAs. The RNAi systems recognize the target RNA molecules via small guide RNAs that are completely or partially complementary to a region of the target. Key components of the RNAi systems are proteins of the Argonaute-PIWI family some of which function as slicers, the nucleases that cleave the target RNA that is base-paired to a guide RNA. Numerous prokaryotes possess the CRISPR-associated system (CASS) of defense against phages and plasmids that is, in part, mechanistically analogous but not homologous to eukaryotic RNAi systems. Many prokaryotes also encode homologs of Argonaute-PIWI proteins but their functions remain unknown. Results: We present a detailed analysis of Argonaute-PIWI protein sequences and the genomic neighborhoods of the respective genes in prokaryotes. Whereas eukaryotic Ago/PIWI proteins always contain PAZ (oligonucleotide binding) and PIWI (active or inactivated nuclease) domains, the prokaryotic Argonaute homologs (pAgos) fall into two major groups in which the PAZ domain is either present or absent. The monophyly of each group is supported by a phylogenetic analysis of the conserved PIWI-domains. Almost all pAgos that lack a PAZ domain appear to be inactivated, and the respective genes are associated with a variety of predicted nucleases in putative operons. An additional, uncharacterized domain that is fused to various nucleases appears to be a unique signature of operons encoding the short (lacking PAZ) pAgo form. By contrast, almost all PAZ-domain containing pAgos are predicted to be active nucleases. Some proteins of this group (e.g., that from Aquifex aeolicus) have been experimentally shown to possess nuclease activity, and are not typically associated with genes for other (putative) nucleases.
Given these observations, the apparent extensive horizontal transfer of pAgo genes, and their common, statistically significant over-representation in genomic neighborhoods enriched in genes encoding proteins involved in the defense against phages and/or plasmids, we hypothesize that pAgos are key components of a novel class of defense systems. The PAZ-domain containing pAgos are predicted to directly destroy virus or plasmid nucleic acids via their nuclease activity, whereas the apparently inactivated, PAZ-lacking pAgos could be structural subunits of protein complexes that contain, as active moieties, the putative nucleases that we predict to be co-expressed with these pAgos. All these nucleases are predicted to be DNA endonucleases, so it seems most probable that the putative novel phage/plasmid-defense system targets phage DNA rather than mRNAs. Given that in eukaryotic RNAi systems, the PAZ domain binds a guide RNA and positions it on the complementary region of the target, we further speculate that pAgos function on a similar principle (the guide being either DNA or RNA), and that the uncharacterized domain found in putative operons with the short forms of pAgos is a functional substitute for the PAZ domain. Conclusion: The hypothesis that pAgos are key components of a novel prokaryotic immune system that employs guide RNA or DNA molecules to degrade nucleic acids of invading mobile elements implies a functional analogy with the prokaryotic CASS and a direct evolutionary connection with eukaryotic RNAi. The predictions of the hypothesis including both the activities of pAgos and those of the associated endonucleases are readily amenable to experimental tests. Reviewers: This article was reviewed by Daniel Haft, Martijn Huynen, and Chris Ponting.
- Published
- 2009
26. Just how Lamarckian is CRISPR-Cas immunity: the continuum of evolvability mechanisms.
- Author
-
Koonin, Eugene V. and Wolf, Yuri I.
- Subjects
*IMMUNITY , *LAMARCKIANISM , *BACTERIAL loci , *GENOMES , *BACTERIAL DNA , *GENETIC mutation
- Abstract
The CRISPR-Cas system of prokaryotic adaptive immunity displays features of a mechanism for directional, Lamarckian evolution. Indeed, this system modifies a specific locus in a bacterial or archaeal genome by inserting a piece of foreign DNA into a CRISPR array which results in acquired, heritable resistance to the cognate selfish element. A key element of the Lamarckian scheme is the specificity and directionality of the mutational process whereby an environmental cue causes only mutations that provide specific adaptations to the original challenge. In the case of adaptive immunity, the specificity of mutations is equivalent to self-nonself discrimination. Recent studies on the CRISPR mechanism have shown that the levels of discrimination can substantially differ such that in some CRISPR-Cas variants incorporation of DNA is random whereas discrimination occurs by selection of cells that carry cognate inserts. In other systems, a higher level of specificity appears to be achieved via specialized mechanisms. These findings emphasize the continuity between random and directed mutations and the critical importance of evolved mechanisms that govern the mutational process. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
27. Immunity, suicide or both? Ecological determinants for the combined evolution of anti-pathogen defense systems.
- Author
-
Iranzo, Jaime, Lobkovsky, Alexander E., Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
NATURAL immunity ,PROKARYOTES ,COEVOLUTION ,MICROBIAL evolution ,MICROORGANISM populations
- Abstract
Background: Parasite-host arms race is one of the key factors in the evolution of life. Most cellular life forms, in particular prokaryotes, possess diverse forms of defense against pathogens including innate immunity, adaptive immunity and programmed cell death (altruistic suicide). Coevolution of these different but interacting defense strategies yields complex evolutionary regimes. Results: We develop and extensively analyze a computational model of coevolution of different defense strategies to show that suicide as a defense mechanism can evolve only in structured populations and when the attainable degree of immunity against pathogens is limited. The general principle of defense evolution seems to be that hosts do not evolve two costly defense mechanisms when one is sufficient. Thus, the evolutionary interplay of innate immunity, adaptive immunity and suicide, leads to an equilibrium state where the combination of all three defense strategies is limited to a distinct, small region of the parameter space. The three strategies can stably coexist only if none of them are highly effective. Coupled adaptive immunity-suicide systems, the existence of which is implied by the colocalization of genes for the two types of defense in prokaryotic genomes, can evolve either when immunity-associated suicide is more efficacious than other suicide systems or when adaptive immunity functionally depends on the associated suicide system. Conclusions: Computational modeling reveals a broad range of outcomes of coevolution of anti-pathogen defense strategies depending on the relative efficacy of different mechanisms and population structure. Some of the predictions of the model appear compatible with recent experimental evolution results and call for additional experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
28. Babela massiliensis, a representative of a widespread bacterial phylum with unusual adaptations to parasitism in amoebae.
- Author
-
Pagnier, Isabelle, Yutin, Natalya, Croce, Olivier, Makarova, Kira S., Wolf, Yuri I., Benamar, Samia, Raoult, Didier, Koonin, Eugene V., and La Scola, Bernard
- Subjects
BIOLOGICAL adaptation ,PARASITISM ,AMOEBIDA ,ARCHAEBACTERIA ,METAGENOMICS
- Abstract
Background: Only a small fraction of bacteria and archaea that are identifiable by metagenomics can be grown on standard media. Recent efforts on deep metagenomics sequencing, single-cell genomics and the use of specialized culture conditions (culturomics) increasingly yield novel microbes some of which represent previously uncharacterized phyla and possess unusual biological traits. Results: We report isolation and genome analysis of Babela massiliensis, an obligate intracellular parasite of Acanthamoeba castellanii. B. massiliensis shows an unusual, fission mode of cell multiplication whereby large, polymorphic bodies accumulate in the cytoplasm of infected amoeba and then split into mature bacterial cells. This unique mechanism of cell division is associated with a deep degradation of the cell division machinery and delayed expression of the ftsZ gene. The genome of B. massiliensis consists of a circular chromosome approximately 1.12 megabases in size that encodes 981 predicted proteins, 38 tRNAs and one typical rRNA operon. Phylogenetic analysis shows that B. massiliensis belongs to the putative bacterial phylum TM6 that so far was represented by the draft genome of the JCVI TM6SC1 bacterium obtained by single cell genomics and numerous environmental sequences. Conclusions: Currently, B. massiliensis is the only cultivated member of the putative TM6 phylum. Phylogenomic analysis shows diverse taxonomic affinities for B. massiliensis genes, suggestive of multiple gene acquisitions via horizontal transfer from other bacteria and eukaryotes. Horizontal gene transfer is likely to be facilitated by the cohabitation of diverse parasites and symbionts inside amoeba. B. massiliensis encompasses many genes encoding proteins implicated in parasite-host interaction including the greatest number of ankyrin repeats among sequenced bacteria and diverse proteins related to the ubiquitin system. Characterization of B. massiliensis, a representative of a distinct bacterial phylum, thanks to its ability to grow in amoeba, reaffirms the critical role of diverse culture approaches in microbiology. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
29. Encapsulated in silica: genome, proteome and physiology of the thermophilic bacterium Anoxybacillus flavithermus WK1
- Author
-
Daniel J. Rigden, Eugene V. Koonin, Shaobin Hou, Jennifer A. Saito, Bruce W. Mountain, Lei Wang, Peter F. Dunfield, Guang Zhao, Junli Wu, Marina V. Omelchenko, Kira S. Makarova, Yuri I. Wolf, Michael Y. Galperin, Dan Li, Matthew B. Stott, Maqsudul Alam, Lu Feng, and Jimmy H. Saw
- Subjects
Genetics ,Whole genome sequencing ,Bacilli ,Hot Temperature ,biology ,Fossils ,Thermophile ,Research ,Anoxybacillus ,biochemical phenomena, metabolism, and nutrition ,biology.organism_classification ,Silicon Dioxide ,Geobacillus ,Genome ,Biochemistry ,Bacterial Proteins ,Proteome ,Water Microbiology ,Bacillaceae ,Bacteria ,Genome, Bacterial ,New Zealand
- Abstract
Sequencing of the complete genome of Anoxybacillus flavithermus reveals enzymes that are required for silica adaptation and biofilm formation. Background: Gram-positive bacteria of the genus Anoxybacillus have been found in diverse thermophilic habitats, such as geothermal hot springs and manure, and in processed foods such as gelatin and milk powder. Anoxybacillus flavithermus is a facultatively anaerobic bacterium found in super-saturated silica solutions and in opaline silica sinter. The ability of A. flavithermus to grow in super-saturated silica solutions makes it an ideal subject to study the processes of sinter formation, which might be similar to the biomineralization processes that occurred at the dawn of life. Results: We report here the complete genome sequence of A. flavithermus strain WK1, isolated from the waste water drain at the Wairakei geothermal power station in New Zealand. It consists of a single chromosome of 2,846,746 base pairs and is predicted to encode 2,863 proteins. In silico genome analysis identified several enzymes that could be involved in silica adaptation and biofilm formation, and their predicted functions were experimentally validated in vitro. Proteomic analysis confirmed the regulation of biofilm-related proteins and crucial enzymes for the synthesis of long-chain polyamines as constituents of silica nanospheres. Conclusions: Microbial fossils preserved in silica and silica sinters are excellent objects for studying ancient life, a new paleobiological frontier. An integrated analysis of the A. flavithermus genome and proteome provides the first glimpse of metabolic adaptation during silicification and sinter formation. Comparative genome analysis suggests an extensive gene loss in the Anoxybacillus/Geobacillus branch after its divergence from other bacilli.
- Published
- 2008
30. Estimation of prokaryotic supergenome size and composition from gene frequency distributions.
- Author
-
Lobkovsky, Alexander E., Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
*PROKARYOTIC genomes , *GENE frequency , *GAMMA distributions , *PROBABILITY theory , *GENETIC transformation - Abstract
Background: Because prokaryotic genomes experience a rapid flux of genes, selection may act at a higher level than an individual genome. We explore a quantitative model of the distributed genome whereby groups of genomes evolve by acquiring genes from a fixed reservoir which we denote as supergenome. Previous attempts to understand the nature of the supergenome treated genomes as random, independent collections of genes and assumed that the supergenome consists of a small number of homogeneous sub-reservoirs. Here we explore the consequences of relaxing both assumptions. Results: We surveyed several methods for estimating the size and composition of the supergenome. The methods assumed that genomes were either random, independent samples of the supergenome or that they evolved from a common ancestor along a known tree via stochastic sampling from the reservoir. The reservoir was assumed to be either a collection of homogeneous sub-reservoirs or alternatively composed of genes with Gamma distributed gain probabilities. Empirical gene frequencies were used to either compute the likelihood of the data directly or first to reconstruct the history of gene gains and then compute the likelihood of the reconstructed numbers of gains. Conclusions: Supergenome size estimates using the empirical gene frequencies directly are not robust with respect to the choice of the model. By contrast, using the gene frequencies and the phylogenetic tree to reconstruct multiple gene gains produces reliable estimates of the supergenome size and indicates that a homogeneous supergenome is more consistent with the data than a supergenome with Gamma distributed gain probabilities. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
31. Genomes in turmoil: Quantification of genome dynamics in prokaryote supergenomes.
- Author
-
Puigbò, Pere, Lobkovsky, Alexander E., Kristensen, David M., Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
GENOMES ,PROKARYOTES ,GENETIC transformation ,ARCHAEBACTERIA ,MICROORGANISMS - Abstract
Background Genomes of bacteria and archaea (collectively, prokaryotes) appear to exist in incessant flux, expanding via horizontal gene transfer (HGT) and gene duplication, and contracting via gene loss. However, the actual rates of genome dynamics and relative contributions of different types of events across the diversity of prokaryotes are largely unknown, as are the sizes of microbial supergenomes, i.e. pools of genes that are accessible to the given microbial species. Results We performed a comprehensive analysis of the genome dynamics in 35 groups (34 bacterial and one archaeal) of closely related microbial genomes using a phylogenetic birth-and-death maximum likelihood model to quantify the rates of gene family gain and loss, as well as expansion and reduction. The results show that loss of gene families dominates the evolution of prokaryotes, occurring at approximately three times the rate of gain. The rates of gene family expansion and reduction are typically 7 and 20 times less than the gain and loss rates, respectively. Thus, the prevailing mode of evolution in bacteria and archaea is genome contraction that is partially compensated by the gain of new gene families via horizontal gene transfer. However, the rates of gene family gain, loss, expansion and reduction vary within wide ranges, with the most stable genomes showing rates about 25 times lower than the most dynamic genomes. For many groups, the supergenome estimated from the fraction of repetitive gene family gains includes about 10 fold more gene families than the typical genome in the group although some groups appear to have vast, "open" supergenomes. 
Conclusions Reconstruction of evolution in groups of closely related bacteria and archaea reveals extremely rapid and highly variable flux of genes in evolving microbial genomes, demonstrates that extensive gene loss and horizontal gene transfer leading to innovation are the two dominant evolutionary processes, and yields robust estimates of the supergenome size. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
32. Pseudo-chaotic oscillations in CRISPR-virus coevolution predicted by bifurcation analysis.
- Author
-
Berezovskaya, Faina S., Wolf, Yuri I., Koonin, Eugene V., and Karev, Georgy P.
- Subjects
*CRISPRS , *COEVOLUTION , *BIFURCATION theory , *VIRUS inhibitors , *EPIGENETICS , *COMPARATIVE genomics - Abstract
Background The CRISPR-Cas systems of adaptive antivirus immunity are present in most archaea and many bacteria, and provide resistance to specific viruses or plasmids by inserting fragments of foreign DNA into the host genome and then utilizing transcripts of these spacers to inactivate the cognate foreign genome. The recent development of powerful genome engineering tools on the basis of CRISPR-Cas has sharply increased the interest in the diversity and evolution of these systems. Comparative genomic data indicate that during evolution of prokaryotes CRISPR-Cas loci are lost and acquired via horizontal gene transfer at high rates. Mathematical modeling and initial experimental studies of CRISPR-carrying microbes and viruses reveal complex coevolutionary dynamics. Results We performed a bifurcation analysis of models of coevolution of viruses and microbial hosts that possess CRISPR-Cas hereditary adaptive immunity systems. The analyzed Malthusian and logistic models display complex, and in particular, quasi-chaotic oscillation regimes that have not been previously observed experimentally or in agent-based models of the CRISPR-mediated immunity. The key factors for the appearance of the quasi-chaotic oscillations are the non-linear dependence of the host immunity on the virus load and the partitioning of the hosts into the immune and susceptible populations, so that the system consists of three components. Conclusions Bifurcation analysis of CRISPR-host coevolution model predicts complex regimes including quasi-chaotic oscillations. The quasi-chaotic regimes of virus-host coevolution are likely to be biologically relevant given the evolutionary instability of the CRISPR-Cas loci revealed by comparative genomics. The results of this analysis might have implications beyond the CRISPR-Cas systems, i.e. could describe the behavior of any adaptive immunity system with a heritable component, be it genetic or epigenetic. These predictions are experimentally testable. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
33. Insights into archaeal evolution and symbiosis from the genomes of a nanoarchaeon and its inferred crenarchaeal host from Obsidian Pool, Yellowstone National Park.
- Author
-
Podar, Mircea, Makarova, Kira S., Graham, David E., Wolf, Yuri I., Koonin, Eugene V., and Reysenbach, Anna-Louise
- Subjects
MARINE organisms ,ARCHAEBACTERIA ,ARCHAEBACTERIAL genomes ,BIOSYNTHESIS ,CELL division ,BACTERIA - Abstract
Background: A single cultured marine organism, Nanoarchaeum equitans, represents the Nanoarchaeota branch of symbiotic Archaea, with a highly reduced genome and unusual features such as multiple split genes. Results: The first terrestrial hyperthermophilic member of the Nanoarchaeota was collected from Obsidian Pool, a thermal feature in Yellowstone National Park, separated by single cell isolation, and sequenced together with its putative host, a Sulfolobales archaeon. Both the new Nanoarchaeota (Nst1) and N. equitans lack most biosynthetic capabilities, and phylogenetic analysis of ribosomal RNA and protein sequences indicates that the two form a deep-branching archaeal lineage. However, the Nst1 genome is more than 20% larger, and encodes a complete gluconeogenesis pathway as well as the full complement of archaeal flagellum proteins. With a larger genome, a smaller repertoire of split protein encoding genes and no split non-contiguous tRNAs, Nst1 appears to have experienced less severe genome reduction than N. equitans. These findings imply that, rather than representing ancestral characters, the extremely compact genomes and multiple split genes of Nanoarchaeota are derived characters associated with their symbiotic or parasitic lifestyle. The inferred host of Nst1 is potentially autotrophic, with a streamlined genome and simplified central and energetic metabolism as compared to other Sulfolobales. Conclusions: Comparison of the N. equitans and Nst1 genomes suggests that the marine and terrestrial lineages of Nanoarchaeota share a common ancestor that was already a symbiont of another archaeon. The two distinct Nanoarchaeota-host genomic data sets offer novel insights into the evolution of archaeal symbiosis and parasitism, enabling further studies of the cellular and molecular mechanisms of these relationships. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
34. Updated clusters of orthologous genes for Archaea: a complex ancestor of the Archaea and the byways of horizontal gene transfer.
- Author
-
Wolf, Yuri I., Makarova, Kira S., Yutin, Natalya, and Koonin, Eugene V.
- Subjects
*ARCHAEBACTERIA , *GENETICS , *PROKARYOTES , *GENOMICS , *GENETIC transformation - Abstract
Background: Collections of Clusters of Orthologous Genes (COGs) provide indispensable tools for comparative genomic analysis, evolutionary reconstruction and functional annotation of new genomes. Initially, COGs were made for all complete genomes of cellular life forms that were available at the time. However, with the accumulation of thousands of complete genomes, construction of a comprehensive COG set has become extremely computationally demanding and prone to error propagation, necessitating the switch to taxon-specific COG collections. Previously, we reported the collection of COGs for 41 genomes of Archaea (arCOGs). Here we present a major update of the arCOGs and describe evolutionary reconstructions to reveal general trends in the evolution of Archaea. Results: The updated version of the arCOG database incorporates 91% of the pangenome of 120 archaea (251,032 protein-coding genes altogether) into 10,335 arCOGs. Using this new set of arCOGs, we performed maximum likelihood reconstruction of the genome content of archaeal ancestral forms and gene gain and loss events in archaeal evolution. This reconstruction shows that the last Common Ancestor of the extant Archaea was an organism of greater complexity than most of the extant archaea, probably with over 2,500 protein-coding genes. The subsequent evolution of almost all archaeal lineages was apparently dominated by gene loss resulting in genome streamlining. Overall, in the evolution of Archaea as well as a representative set of bacteria that was similarly analyzed for comparison, gene losses are estimated to outnumber gene gains at least 4 to 1. Analysis of specific patterns of gene gain in Archaea shows that, although some groups, in particular Halobacteria, acquire substantially more genes than others, on the whole, gene exchange between major groups of Archaea appears to be largely random, with no major 'highways' of horizontal gene transfer. 
Conclusions: The updated collection of arCOGs is expected to become a key resource for comparative genomics, evolutionary reconstruction and functional annotation of new archaeal genomes. Given that, in spite of the major increase in the number of genomes, the conserved core of archaeal genes appears to be stabilizing, the major evolutionary trends revealed here have a chance to stand the test of time. Reviewers: This article was reviewed by (for complete reviews see the Reviewers' Reports section): Dr. PLG, Prof. PF, Dr. PL (nominated by Prof. JPG). [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
35. The common ancestry of life.
- Author
-
Koonin, Eugene V. and Wolf, Yuri I.
- Subjects
*GENETIC code , *PROTEINS , *BIOMOLECULES , *MOLECULAR genetics - Abstract
Background: It is common belief that all cellular life forms on earth have a common origin. This view is supported by the universality of the genetic code and the universal conservation of multiple genes, particularly those that encode key components of the translation system. A remarkable recent study claims to provide a formal, homology independent test of the Universal Common Ancestry hypothesis by comparing the ability of a common-ancestry model and a multiple-ancestry model to predict sequences of universally conserved proteins. Results: We devised a computational experiment on a concatenated alignment of universally conserved proteins which shows that the purported demonstration of the universal common ancestry is a trivial consequence of significant sequence similarity between the analyzed proteins. The nature and origin of this similarity are irrelevant for the prediction of "common ancestry" by the model-comparison approach. Thus, homology (common origin) of the compared proteins remains an inference from sequence similarity rather than an independent property demonstrated by the likelihood analysis. Conclusion: A formal demonstration of the Universal Common Ancestry hypothesis has not been achieved and is unlikely to be feasible in principle. Nevertheless, the evidence in support of this hypothesis provided by comparative genomics is overwhelming. Reviewers: this article was reviewed by William Martin, Ivan Iossifov (nominated by Andrey Rzhetsky) and Arcady Mushegian. For the complete reviews, see the Reviewers' Report section. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
36. Non-homologous isofunctional enzymes: A systematic analysis of alternative solutions in enzyme evolution.
- Author
-
Omelchenko, Marina V., Galperin, Michael Y., Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
ENZYMES ,PROTEINS ,MOLECULAR evolution ,PROTEOMICS ,GENOMICS - Abstract
Background: Evolutionarily unrelated proteins that catalyze the same biochemical reactions are often referred to as analogous - as opposed to homologous - enzymes. The existence of numerous alternative, non-homologous enzyme isoforms presents an interesting evolutionary problem; it also complicates genome-based reconstruction of the metabolic pathways in a variety of organisms. In 1998, a systematic search for analogous enzymes resulted in the identification of 105 Enzyme Commission (EC) numbers that included two or more proteins without detectable sequence similarity to each other, including 34 EC nodes where proteins were known (or predicted) to have distinct structural folds, indicating independent evolutionary origins. In the past 12 years, many putative non-homologous isofunctional enzymes were identified in newly sequenced genomes. In addition, efforts in structural genomics resulted in a vastly improved structural coverage of proteomes, providing for definitive assessment of (non)homologous relationships between proteins. Results: We report the results of a comprehensive search for non-homologous isofunctional enzymes (NISE) that yielded 185 EC nodes with two or more experimentally characterized - or predicted - structurally unrelated proteins. Of these NISE sets, only 74 were from the original 1998 list. Structural assignments of the NISE show over-representation of proteins with the TIM barrel fold and the nucleotide-binding Rossmann fold. From the functional perspective, the set of NISE is enriched in hydrolases, particularly carbohydrate hydrolases, and in enzymes involved in defense against oxidative stress. Conclusions: These results indicate that at least some of the non-homologous isofunctional enzymes were recruited relatively recently from enzyme families that are active against related substrates and are sufficiently flexible to accommodate changes in substrate specificity. Reviewers: This article was reviewed by Andrei Osterman, Keith F. Tipton (nominated by Martijn Huynen) and Igor B. Zhulin. For the full reviews, go to the Reviewers' comments section. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
37. The origins of phagocytosis and eukaryogenesis.
- Author
-
Yutin, Natalya, Wolf, Maxim Y., Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
ACTIN ,PHAGOCYTOSIS ,EUKARYOTIC cells ,IMMUNE response ,CYTOSKELETON - Abstract
Background: Phagocytosis, that is, engulfment of large particles by eukaryotic cells, is found in diverse organisms and is often thought to be central to the very origin of the eukaryotic cell, in particular, for the acquisition of bacterial endosymbionts including the ancestor of the mitochondrion. Results: Comparisons of the sets of proteins implicated in phagocytosis in different eukaryotes reveal extreme diversity, with very few highly conserved components that typically do not possess readily identifiable prokaryotic homologs. Nevertheless, phylogenetic analysis of those proteins for which such homologs do exist yields clues to the possible origin of phagocytosis. The central finding is that a subset of archaea encode actins that are not only monophyletic with eukaryotic actins but also share unique structural features with actin-related proteins (Arp) 2 and 3. All phagocytic processes are strictly dependent on remodeling of the actin cytoskeleton and the formation of branched filaments for which Arp2/3 are responsible. The presence of common structural features in Arp2/3 and the archaeal actins suggests that the common ancestors of the archaeal and eukaryotic actins were capable of forming branched filaments, like modern Arp2/3. The Rho family GTPases that are ubiquitous regulators of phagocytosis in eukaryotes appear to be of bacterial origin, so assuming that the host of the mitochondrial endosymbiont was an archaeon, the genes for these GTPases come via horizontal gene transfer from the endosymbiont or in an earlier event. Conclusion: The present findings suggest a hypothetical scenario of eukaryogenesis under which the archaeal ancestor of eukaryotes had no cell wall (like modern Thermoplasma) but had an actin-based cytoskeleton including branched actin filaments that allowed this organism to produce actin-supported membrane protrusions. These protrusions would facilitate accidental, occasional engulfment of bacteria, one of which eventually became the mitochondrion. The acquisition of the endosymbiont triggered eukaryogenesis, in particular, the emergence of the endomembrane system that eventually led to the evolution of modern-type phagocytosis, independently in several eukaryotic lineages. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
38. Complete genome sequence of the extremely acidophilic methanotroph isolate V4, Methylacidiphilum infernorum, a representative of the bacterial phylum Verrucomicrobia.
- Author
-
Hou, Shaobin, Makarova, Kira S., Saw, Jimmy H. W., Senin, Pavel, Ly, Benjamin V., Zhemin Zhou, Yan Ren, Jianmei Wang, Galperin, Michael Y., Omelchenko, Marina V., Wolf, Yuri I., Yutin, Natalya, Koonin, Eugene V., Stott, Matthew B., Mountain, Bruce W., Crowe, Michelle A., Smirnova, Angela V., Dunfield, Peter F., Lu Feng, and Lei Wang
- Subjects
NUCLEOTIDE sequence ,BACTERIAL typing ,METHANOTROPHS ,BACTERIAL genetics ,GENETIC regulation - Abstract
Background: The phylum Verrucomicrobia is a widespread but poorly characterized bacterial clade. Although cultivation-independent approaches detect representatives of this phylum in a wide range of environments, including soils, seawater, hot springs and human gastrointestinal tract, only few have been isolated in pure culture. We have recently reported cultivation and initial characterization of an extremely acidophilic methanotrophic member of the Verrucomicrobia, strain V4, isolated from the Hell's Gate geothermal area in New Zealand. Similar organisms were independently isolated from geothermal systems in Italy and Russia. Results: We report the complete genome sequence of strain V4, the first one from a representative of the Verrucomicrobia. Isolate V4, initially named "Methylokorus infernorum" (and recently renamed Methylacidiphilum infernorum) is an autotrophic bacterium with a streamlined genome of ∼2.3 Mbp that encodes simple signal transduction pathways and has a limited potential for regulation of gene expression. Central metabolism of M. infernorum was reconstructed almost completely and revealed highly interconnected pathways of autotrophic central metabolism and modifications of C1-utilization pathways compared to other known methylotrophs. The M. infernorum genome does not encode tubulin, which was previously discovered in bacteria of the genus Prosthecobacter, or close homologs of any other signature eukaryotic proteins. Phylogenetic analysis of ribosomal proteins and RNA polymerase subunits unequivocally supports grouping Planctomycetes, Verrucomicrobia and Chlamydiae into a single clade, the PVC superphylum, despite dramatically different gene content in members of these three groups. Comparative-genomic analysis suggests that evolution of the M. infernorum lineage involved extensive horizontal gene exchange with a variety of bacteria. The genome of M. infernorum shows apparent adaptations for existence under extremely acidic conditions including a major upward shift in the isoelectric points of proteins. Conclusion: The results of genome analysis of M. infernorum support the monophyly of the PVC superphylum. M. infernorum possesses a streamlined genome but seems to have acquired numerous genes including those for enzymes of methylotrophic pathways via horizontal gene transfer, in particular, from Proteobacteria. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
39. Evolutionary primacy of sodium bioenergetics.
- Author
-
Mulkidjanian, Armen Y., Galperin, Michael Y., Makarova, Kira S., Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
BIOENERGETICS ,SODIUM ,BIOLOGICAL evolution ,ADENOSINE triphosphatase ,CHROMOSOMAL translocation ,PROTONS ,SODIUM ions - Abstract
Background: The F- and V-type ATPases are rotary molecular machines that couple translocation of protons or sodium ions across the membrane to the synthesis or hydrolysis of ATP. Both the F-type (found in most bacteria and eukaryotic mitochondria and chloroplasts) and V-type (found in archaea, some bacteria, and eukaryotic vacuoles) ATPases can translocate either protons or sodium ions. The prevalent proton-dependent ATPases are generally viewed as the primary form of the enzyme whereas the sodium-translocating ATPases of some prokaryotes are usually construed as an exotic adaptation to survival in extreme environments. Results: We combine structural and phylogenetic analyses to clarify the evolutionary relation between the proton- and sodium-translocating ATPases. A comparison of the structures of the membrane-embedded oligomeric proteolipid rings of sodium-dependent F- and V-ATPases reveals nearly identical sets of amino acids involved in sodium binding. We show that the sodium-dependent ATPases are scattered among proton-dependent ATPases in both the F- and the V-branches of the phylogenetic tree. Conclusion: Barring convergent emergence of the same set of ligands in several lineages, these findings indicate that the use of sodium gradient for ATP synthesis is the ancestral modality of membrane bioenergetics. Thus, a primitive, sodium-impermeable but proton-permeable cell membrane that harboured a set of sodium-transporting enzymes appears to have been the evolutionary predecessor of the more structurally demanding proton-tight membranes. The use of proton as the coupling ion appears to be a later innovation that emerged on several independent occasions. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
40. Evolution of the genetic code: partial optimization of a random code for robustness to translation error in a rugged fitness landscape.
- Author
-
Novozhilov, Artem S., Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
*GENETIC code , *AMINO acids , *NUCLEOTIDES , *ROBUST control , *TRANSFER RNA , *NUCLEOTIDE sequence - Abstract
Background: The standard genetic code table has a distinctly non-random structure, with similar amino acids often encoded by codons series that differ by a single nucleotide substitution, typically, in the third or the first position of the codon. It has been repeatedly argued that this structure of the code results from selective optimization for robustness to translation errors such that translational misreading has the minimal adverse effect. Indeed, it has been shown in several studies that the standard code is more robust than a substantial majority of random codes. However, it remains unclear how much evolution the standard code underwent, what is the level of optimization, and what is the likely starting point. Results: We explored possible evolutionary trajectories of the genetic code within a limited domain of the vast space of possible codes. Only those codes were analyzed for robustness to translation error that possess the same block structure and the same degree of degeneracy as the standard code. This choice of a small part of the vast space of possible codes is based on the notion that the block structure of the standard code is a consequence of the structure of the complex between the cognate tRNA and the codon in mRNA where the third base of the codon plays a minimum role as a specificity determinant. Within this part of the fitness landscape, a simple evolutionary algorithm, with elementary evolutionary steps comprising swaps of four-codon or two-codon series, was employed to investigate the optimization of codes for the maximum attainable robustness. The properties of the standard code were compared to the properties of four sets of codes, namely, purely random codes, random codes that are more robust than the standard code, and two sets of codes that resulted from optimization of the first two sets. 
The comparison of these sets of codes with the standard code and its locally optimized version showed that, on average, optimization of random codes yielded evolutionary trajectories that converged at the same level of robustness to translation errors as the optimization path of the standard code; however, the standard code required considerably fewer steps to reach that level than an average random code. When evolution starts from random codes whose fitness is comparable to that of the standard code, they typically reach much higher level of optimization than the standard code, i.e., the standard code is much closer to its local minimum (fitness peak) than most of the random codes with similar levels of robustness. Thus, the standard genetic code appears to be a point on an evolutionary trajectory from a random point (code) about half the way to the summit of the local peak. The fitness landscape of code evolution appears to be extremely rugged, containing numerous peaks with a broad distribution of heights, and the standard code is relatively unremarkable, being located on the slope of a moderate-height peak. Conclusion: The standard code appears to be the result of partial optimization of a random code for robustness to errors of translation. The reason the code is not fully optimized could be the trade-off between the beneficial effect of increasing robustness to translation errors and the deleterious effect of codon series reassignment that becomes increasingly severe with growing complexity of the evolving system. Thus, evolution of the code can be represented as a combination of adaptation and frozen accident. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
41. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea.
- Author
-
Makarova, Kira S., Sorokin, Alexander V., Novichkov, Pavel S., Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
GENES ,GENOMES ,BACTERIA ,ARCHAEBACTERIA ,PROTEINS ,ANTITOXINS - Abstract
Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. Results: New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover ~88% of the genes in a genome compared to a ~76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; ~40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. 
It is inferred that LACA was a chemoautotrophic hyperthermophile that, in addition to the core archaeal functions, encoded more idiosyncratic systems, e.g., the CASS systems of antivirus defense and some toxin-antitoxin systems. Conclusion: The arCOGs provide a convenient, flexible framework for functional annotation of archaeal genomes, comparative genomics and evolutionary reconstructions. Genomic reconstructions suggest that the last common ancestor of archaea might have been (nearly) as advanced as the modern archaeal hyperthermophiles. ArCOGs and related information are available at: ftp://ftp.ncbi.nih.gov/pub/koonin/arCOGs/. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
42. Long intervals of stasis punctuated by bursts of positive selection in the seasonal evolution of influenza A virus.
- Author
-
Wolf, Yuri I., Viboud, Cecile, Holmes, Edward C., Koonin, Eugene V., and Lipman, David J.
- Subjects
*INFLUENZA A virus , *HEMAGGLUTININ , *AGGLUTINATION , *VIRAL proteins , *MICROBIAL proteins , *AMINO acids , *EPITOPES - Abstract
Background: The interpandemic evolution of the influenza A virus hemagglutinin (HA) protein is commonly considered a paragon of rapid evolutionary change under positive selection in which amino acid replacements are fixed by virtue of their effect on antigenicity, enabling the virus to evade immune surveillance. Results: We performed phylogenetic analyses of the recently obtained large and relatively unbiased samples of the HA sequences from 1995-2005 isolates of the H3N2 and H1N1 subtypes of influenza A virus. Unexpectedly, it was found that the evolution of H3N2 HA includes long intervals of generally neutral sequence evolution without apparent substantial antigenic change ("stasis" periods) that are characterized by an excess of synonymous over nonsynonymous substitutions per site, lack of association of amino acid replacements with epitope regions, and slow extinction of coexisting virus lineages. These long periods of stasis are punctuated by shorter intervals of rapid evolution under positive selection during which new dominant lineages quickly displace previously coexisting ones. The preponderance of positive selection during intervals of rapid evolution is supported by the dramatic excess of amino acid replacements in the epitope regions of HA compared to replacements in the rest of the HA molecule. In contrast, the stasis intervals showed a much more uniform distribution of replacements over the HA molecule, with a statistically significant difference in the rate of synonymous over nonsynonymous substitution in the epitope regions between the two modes of evolution. A number of parallel amino acid replacements - the same amino acid substitution occurring independently in different lineages - were also detected in H3N2 HA. These parallel mutations were, largely, associated with periods of rapid fitness change, indicating that there are major limitations on evolutionary pathways during antigenic change. 
The finding that stasis is the prevailing modality of H3N2 evolution suggests that antigenic changes that lead to an increase in fitness typically result from epistatic interactions between several amino acid substitutions in the HA and, perhaps, other viral proteins. The strains that become dominant due to increased fitness emerge from low frequency strains thanks to the last amino acid replacement that completes the set of replacements required to produce a significant antigenic change; no subset of substitutions results in a biologically significant antigenic change and corresponding fitness increase. In contrast to H3N2, no clear intervals of evolution under positive selection were detected for the H1N1 HA during the same time span. Thus, the ascendancy of H1N1 in some seasons is, most likely, caused by the drop in the relative fitness of the previously prevailing H3N2 lineages as the fraction of susceptible hosts decreases during the stasis intervals. Conclusion: We show that the common view of the evolution of influenza virus as a rapid, positive selection-driven process is, at best, incomplete. Rather, the interpandemic evolution of influenza appears to consist of extended intervals of stasis, which are characterized by neutral sequence evolution, punctuated by shorter intervals of rapid fitness increase when evolutionary change is driven by positive selection. These observations have implications for influenza surveillance and vaccine formulation; in particular, the possibility exists that parallel amino acid replacements could serve as a predictor of new dominant strains. [ABSTRACT FROM AUTHOR]
- Published
- 2006
43. Comparative genomics of Thermus thermophilus and Deinococcus radiodurans: divergent routes of adaptation to thermophily and radiation resistance.
- Author
-
Omelchenko, Marina V, Wolf, Yuri I, Gaidamakova, Elena K, Matrosova, Vera Y, Vasilenko, Alexander, Min Zhai, Daly, Michael J, Koonin, Eugene V, and Makarova, Kira S
- Subjects
-
*GENOMICS, *PHENOTYPES, *THERMOPHILIC microorganisms, *RADIATION, *EVOLUTIONARY theories
- Abstract
Background: Thermus thermophilus and Deinococcus radiodurans belong to a distinct bacterial clade but have remarkably different phenotypes. T. thermophilus is a thermophile, which is relatively sensitive to ionizing radiation and desiccation, whereas D. radiodurans is a mesophile, which is highly radiation- and desiccation-resistant. Here we present an in-depth comparison of the genomes of these two related but differently adapted bacteria. Results: By reconstructing the evolution of Thermus and Deinococcus after the divergence from their common ancestor, we demonstrate a high level of post-divergence gene flux in both lineages. Various aspects of the adaptation to high temperature in Thermus can be attributed to horizontal gene transfer from archaea and thermophilic bacteria; many of the horizontally transferred genes are located on the single megaplasmid of Thermus. In addition, the Thermus lineage has lost a set of genes that are still present in Deinococcus and many other mesophilic bacteria but are not common among thermophiles. By contrast, Deinococcus seems to have acquired numerous genes related to stress response systems from various bacteria. A comparison of the distribution of orthologous genes among the four partitions of the Deinococcus genome and the two partitions of the Thermus genome reveals homology between the Thermus megaplasmid (pTT27) and Deinococcus megaplasmid (DR177). Conclusion: After the radiation from their common ancestor, the Thermus and Deinococcus lineages have taken divergent paths toward their distinct lifestyles. In addition to extensive gene loss, Thermus seems to have acquired numerous genes from thermophiles, which likely was the decisive contribution to its thermophilic adaptation. By contrast, Deinococcus lost few genes but seems to have acquired many bacterial genes that apparently enhanced its ability to survive different kinds of environmental stresses. 
Notwithstanding the accumulation of horizontally transferred genes, we also show that the single megaplasmid of Thermus and the DR177 megaplasmid of Deinococcus are homologous and probably were inherited from the common ancestor of these bacteria. [ABSTRACT FROM AUTHOR]
- Published
- 2005
44. Gene family evolution: an in-depth theoretical and simulation analysis of non-linear birth-death-innovation models.
- Author
-
Karev, Georgy P., Wolf, Yuri I., Berezovskaya, Faina S., and Koonin, Eugene V.
- Subjects
-
*GENES, *CHILDBIRTH, *DEATH, *GENOMES, *MONTE Carlo method
- Abstract
Background: The size distribution of gene families in a broad range of genomes is well approximated by a generalized Pareto function. Evolution of ensembles of gene families can be described with Birth, Death, and Innovation Models (BDIMs). Analysis of the properties of different versions of BDIMs has the potential of revealing important features of genome evolution. Results: In this work, we extend our previous analysis of stochastic BDIMs. In addition to the previously examined rational BDIMs, we introduce potentially more realistic logistic BDIMs, in which birth/death rates are limited for the largest families, and show that their properties are similar to those of models that include no such limitation. We show that the mean time required for the formation of the largest gene families detected in eukaryotic genomes is limited by the mean number of duplications per gene and does not increase indefinitely with the model degree. Instead, this time reaches a minimum value, which corresponds to a non-linear rational BDIM with the degree of approximately 2.7. Even for this BDIM, the mean time of the largest family formation is orders of magnitude greater than any realistic estimates based on the timescale of life's evolution. We employed the embedding chains technique to estimate the expected number of elementary evolutionary events (gene duplications and deletions) preceding the formation of gene families of the observed size and found that the mean number of events exceeds the family size by orders of magnitude, suggesting a highly dynamic process of genome evolution. The variance of the time required for the formation of the largest families was found to be extremely large, with the coefficient of variation >> 1. This indicates that some gene families might grow much faster than the mean rate such that the minimal time required for family formation is more relevant for a realistic representation of genome evolution than the mean time. 
We determined this minimal time using Monte Carlo simulations of family growth from an ensemble of simultaneously evolving singletons. In these simulations, the time elapsed before the formation of the largest family was much shorter than the estimated mean time and was compatible with the timescale of evolution of eukaryotes. Conclusions: The analysis of stochastic BDIMs presented here shows that non-linear versions of such models can well approximate not only the size distribution of gene families but also the dynamics of their formation during genome evolution. The fact that only higher degree BDIMs are compatible with the observed characteristics of genome evolution suggests that the growth of gene families is self-accelerating, which might reflect differential selective pressure acting on different genes. [ABSTRACT FROM AUTHOR]
- Published
- 2004
45. Duplicated genes evolve slower than singletons despite the initial rate increase.
- Author
-
Jordan, I. King, Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
-
*GENES, *EUKARYOTIC cells, *PROTEINS, *GENETICS, *BIOLOGICAL evolution
- Abstract
Background: Gene duplication is an important mechanism that can lead to the emergence of new functions during evolution. The impact of duplication on the mode of gene evolution has been the subject of several theoretical and empirical comparative-genomic studies. It has been shown that, shortly after the duplication, genes seem to experience a considerable relaxation of purifying selection. Results: Here we demonstrate two opposite effects of gene duplication on evolutionary rates. Sequence comparisons between paralogs show that, in accord with previous observations, a substantial acceleration in the evolution of paralogs occurs after duplication, presumably due to relaxation of purifying selection. The effect of gene duplication on evolutionary rate was also assessed by sequence comparison between orthologs that have paralogs (duplicates) and those that do not (singletons). It is shown that, in eukaryotes, duplicates, on average, evolve significantly slower than singletons. Eukaryotic ortholog evolutionary rates for duplicates are also negatively correlated with the number of paralogs per gene and the strength of selection between paralogs. A tally of annotated gene functions shows that duplicates tend to be enriched for proteins with known functions, particularly those involved in signaling and related cellular processes; by contrast, singletons include an over-abundance of poorly characterized proteins. Conclusions: These results suggest that whether or not a gene duplicate is retained by selection depends critically on the pre-existing functional utility of the protein encoded by the ancestral singleton. Duplicates of genes of a higher biological import, which are subject to strong functional constraints on the sequence, are retained relatively more often. 
Thus, the evolutionary trajectory of duplicated genes appears to be determined by two opposing trends, namely, the post-duplication rate acceleration and the generally slow evolutionary rate owing to the high level of functional constraints. [ABSTRACT FROM AUTHOR]
- Published
- 2004
46. No simple dependence between protein evolution rate and the number of protein-protein interactions: only the most prolific interactors tend to evolve slowly.
- Author
-
Jordan, I. King, Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
-
*PROTEINS, *BIOLOGICAL evolution, *PROTEIN-protein interactions, *AMINO acids, *HELICOBACTER pylori
- Abstract
Background: It has been suggested that rates of protein evolution are influenced, to a great extent, by the proportion of amino acid residues that are directly involved in protein function. In agreement with this hypothesis, recent work has shown a negative correlation between evolutionary rates and the number of protein-protein interactions. However, the extent to which the number of protein-protein interactions influences evolutionary rates remains unclear. Here, we address this question at several different levels of evolutionary relatedness. Results: Manually curated data on the number of protein-protein interactions among Saccharomyces cerevisiae proteins was examined for possible correlation with evolutionary rates between S. cerevisiae and Schizosaccharomyces pombe orthologs. Only a very weak negative correlation between the number of interactions and evolutionary rate of a protein was observed. Furthermore, no relationship was found between a more general measure of the evolutionary conservation of S. cerevisiae proteins, based on the taxonomic distribution of their homologs, and the number of protein-protein interactions. However, when the proteins from yeast were assorted into discrete bins according to the number of interactions, it turned out that 6.5% of the proteins with the greatest number of interactions evolved, on average, significantly slower than the rest of the proteins. Comparisons were also performed using protein-protein interaction data obtained with high-throughput analysis of Helicobacter pylori proteins. No convincing relationship between the number of protein-protein interactions and evolutionary rates was detected, either for comparisons of orthologs from two completely sequenced H. pylori strains or for comparisons of H. pylori and Campylobacter jejuni orthologs, even when the proteins were classified into bins by the number of interactions. 
Conclusion: The currently available comparative-genomic data do not support the hypothesis that the evolutionary rates of the majority of proteins substantially depend on the number of protein-protein interactions they are involved in. However, a small fraction of yeast proteins with the largest number of interactions (the hubs of the interaction network) tend to evolve slower than the bulk of the proteins. [ABSTRACT FROM AUTHOR]
- Published
- 2003
47. The COG database: an updated version includes eukaryotes.
- Author
-
Tatusov, Roman L., Fedorova, Natalie D., Jackson, John D., Jacobs, Aviva R., Kiryutin, Boris, Koonin, Eugene V., Krylov, Dmitri M., Mazumder, Raja, Mekhedov, Sergei L., Nikolskaya, Anastasia N., Rao, B. Sridhar, Smirnov, Sergei, Sverdlov, Alexander V., Vasudevan, Sona, Wolf, Yuri I., Yin, Jodie J., and Natale, Darren A.
- Subjects
DATABASES, PROTEINS, PROKARYOTES, GENOMES, GENES
- Abstract
Background: The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. Results: We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or ~54% of the 110,655 analyzed eukaryotic gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of ~20% of the KOG set. 
This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (~1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes. Conclusion: The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies. [ABSTRACT FROM AUTHOR]
- Published
- 2003
48. Birth and death of protein domains: A simple model of evolution explains power law behavior.
- Author
-
Karev, Georgy P., Wolf, Yuri I., Rzhetsky, Andrey Y., Berezovskaya, Faina S., and Koonin, Eugene V.
- Subjects
-
*PROTEINS, *DEATH (Biology), *ENZYMES, *METABOLITES, *PROTEOMICS, *BIOLOGICAL evolution, *CELL death
- Abstract
Background: Power distributions appear in numerous biological, physical and other contexts, which appear to be fundamentally different. In biology, power laws have been claimed to describe the distributions of the connections of enzymes and metabolites in metabolic networks, the number of interaction partners of a given protein, the number of members in paralogous families, and other quantities. In network analysis, power laws imply evolution of the network with preferential attachment, i.e. a greater likelihood of nodes being added to pre-existing hubs. Exploration of different types of evolutionary models in an attempt to determine which of them lead to power law distributions has the potential of revealing non-trivial aspects of genome evolution. Results: A simple model of evolution of the domain composition of proteomes was developed, with the following elementary processes: i) domain birth (duplication with divergence), ii) death (inactivation and/or deletion), and iii) innovation (emergence from non-coding or non-globular sequences or acquisition via horizontal gene transfer). This formalism can be described as a birth, death and innovation model (BDIM). The formulas for equilibrium frequencies of domain families of different size and the total number of families at equilibrium are derived for a general BDIM. All asymptotics of equilibrium frequencies of domain families possible for the given type of models are found and their appearance depending on model parameters is investigated. It is proved that the power law asymptotics appears if, and only if, the model is balanced, i.e. domain duplication and deletion rates are asymptotically equal up to the second order. It is further proved that any power asymptotic with the degree not equal to -1 can appear only if the hypothesis of independence of the duplication/deletion rates on the size of a domain family is rejected. 
Specific cases of BDIMs, namely simple, linear, polynomial and rational models, are considered in detail and the distributions of the equilibrium frequencies of domain families of different size are determined for each case. We apply the BDIM formalism to the analysis of the domain family size distributions in prokaryotic and eukaryotic proteomes and show an excellent fit between these empirical data and a particular form of the model, the second-order balanced linear BDIM. Calculation of the parameters of these models suggests surprisingly high innovation rates, comparable to the total domain birth (duplication) and elimination rates, particularly for prokaryotic genomes. Conclusions: We show that a straightforward model of genome evolution, which does not explicitly include selection, is sufficient to explain the observed distributions of domain family sizes, in which power laws appear as asymptotic. However, for the model to be compatible with the data, there has to be a precise balance between domain birth, death and innovation rates, and this is likely to be maintained by selection. The developed approach is oriented at a mathematical description of evolution of domain composition of proteomes, but a simple reformulation could be applied to models of other evolving networks with preferential attachment. [ABSTRACT FROM AUTHOR]
- Published
- 2002
49. Genome trees constructed using five different approaches suggest new major bacterial clades.
- Author
-
Wolf, Yuri I., Rogozin, Igor B., Grishin, Nick V., Tatusov, Roman L., and Koonin, Eugene V.
- Subjects
-
*CLADISTIC analysis, *GENOMES, *NUCLEOTIDE sequence, *PHYLOGENY, *GENETIC transformation, *PROKARYOTES, *GENES
- Abstract
Background: The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes. Results: Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: i) presence-absence of genomes in clusters of orthologous genes; ii) conservation of local gene order (gene pairs) among prokaryotic genomes; iii) parameters of identity distribution for probable orthologs; iv) analysis of concatenated alignments of ribosomal proteins; v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in terms of the relative contributions of phylogenetic relationships and similarities in gene repertoires caused by similar life styles and horizontal gene transfer to the tree topology. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs and particularly the tree made of concatenated ribosomal protein sequences seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: i) Chlamydia-Spirochetes, ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and iii) Actinomycetes-Deinococcales-Cyanobacteria. 
The latter group also appeared to join the low-GC Gram-positive bacteria at a deeper tree node. These new groupings of bacteria were supported by the analysis of alternative topologies in the concatenated ribosomal protein tree using the Kishino-Hasegawa test and by a census of the topologies of 132 individual groups of orthologous proteins. Additionally, the results of this analysis put into question the sister-group relationship between the two major archaeal groups, Euryarchaeota and Crenarchaeota, and suggest instead that Euryarchaeota might be a paraphyletic group with respect to Crenarchaeota. Conclusions: We conclude that, the extensive horizontal gene flow and lineage-specific gene loss notwithstanding, extension of phylogenetic analysis to the genome scale has the potential of uncovering deep evolutionary relationships between prokaryotic lineages. [ABSTRACT FROM AUTHOR]
- Published
- 2001
50. Seeing the Tree of Life behind the phylogenetic forest.
- Author
-
Puigbò, Pere, Wolf, Yuri I., and Koonin, Eugene V.
- Subjects
-
*COMPARATIVE genomics, *BIOLOGICAL evolution, *PHYLOGENY
- Abstract
A letter to the editor is presented in response to the article "Search for a Tree of Life in the thicket of the phylogenetic forest" by P. Puigbo and colleagues in a 2009 issue.
- Published
- 2013