Author: "Kemena C" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Kemena C"' showing total 23 results

1. Alignathon: A competitive assessment of whole-genome alignment methods

Author: Earl, D, Nguyen, N, Hickey, G, Harris, RS, Fitzgerald, S, Beal, K, Seledtsov, I, Molodtsov, V, Raney, BJ, Clawson, H, Kim, J, Kemena, C, Chang, JM, Erb, I, Poliakov, A, Hou, M, Herrero, J, Kent, WJ, Solovyev, V, Darling, AE, Ma, J, Notredame, C, Brudno, M, Dubchak, I, Haussler, D, and Paten, B
Abstract: © 2014 Earl et al. Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments.
Published: 2014

2. Enhancing the Scalability of Consistency-based Progressive Multiple Sequences Alignment Applications.

Author: Orobitg, M., Cores, F., Guirado, F., Kemena, C., Notredame, C., and Ripoll, A.
Abstract: Multiple Sequence Alignment (MSA) is an extremely powerful tool for important biological applications, such as phylogenetic analysis, identification of conserved motifs and domains and structure prediction. In this paper we propose a new approach to reduce the computational requirements of TCoffee, a memory demanding MSA tool that uses a consistency based scheme to produce more accurate alignments. Our goal is to minimize the memory constraints in order to increase the performance and scalability of the application. The experimental results show that our approach is able to reduce the memory consumption and increase both the time performance and the number and the length of sequences that the method can align. In summary, it is able to reduce the memory requirements by between 72% and 58% depending on the optimization level, improving the alignment scalability as a whole. Also, the library reduction yields a further improvement: the alignment execution time can be reduced by up to 92%. These results are obtained without a significant impact on the final alignment quality, that declines by less than 3%. [ABSTRACT FROM PUBLISHER]
Published: 2012
Full Text: View/download PDF

3. Domain similarity based orthology detection

Author: Bitard-Feildel, T. (Tristan), Kemena, C. (Carsten), Greenwood, J.M. (Jenny), Bornberg-Bauer, E. (Erich), and Universitäts- und Landesbibliothek Münster
Subjects: ComputingMethodologies_PATTERNRECOGNITION, ddc:570, Domain, Domain similarity, Orthology, Similarity, Biology, Biochemistry, Molecular Biology, Computer Science Applications
Abstract: Background: Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows in a quadratic order. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. Results: We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. Conclusion: We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. The implementation of porthoDom is released using python and C++ languages and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda.
Full Text: View/download PDF

4. Using tertiary structure for the computation of highly accurate multiple RNA alignments with the SARA-Coffee package

Author: Carsten Kemena, Giovanni Bussotti, Emidio Capriotti, Marc A. Marti-Renom, and Cedric Notredame
Subjects: Statistics and Probability, Base pair, Computation, Computational biology, Biology, Biochemistry, Medical and health sciences, Relevance (information retrieval), Molecular Biology, Protein secondary structure, Developmental biology, Supplementary data, Health sciences, Sequence Analysis, RNA, Biochemistry & molecular biology, RNA, RNA three-dimensional structure, Structural alignment, Protein tertiary structure, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, Benchmark (computing), Nucleic Acid Conformation, Data mining, Sequence Alignment, Algorithms, Software
Abstract: Motivation: Aligning RNAs is useful to search for homologous genes, study evolutionary relationships, detect conserved regions and identify any patterns that may be of biological relevance. Poor levels of conservation among homologs, however, make it difficult to compare RNA sequences, even when considering closely evolutionary related sequences. Results: We describe SARA-Coffee, a tertiary structure-based multiple RNA aligner, which has been validated using BRAliDARTS, a new benchmark framework designed for evaluating tertiary structure-based multiple RNA aligners. We provide two methods to measure the capacity of alignments to match corresponding secondary and tertiary structure features. On this benchmark, SARA-Coffee outperforms both regular aligners and those using secondary structure information. Furthermore, we show that on sequences in which <60% of the nucleotides form base pairs, primary sequence methods usually perform better than secondary-structure aware aligners. Availability and implementation: The package and the datasets are available from http://www.tcoffee.org/Projects/saracoffee and http://structure.biofold.org/sara/. Contact: cedric.notredame@crg.es. Supplementary information: Supplementary data are available at Bioinformatics online.

5. Domain Evolution of Vertebrate Blood Coagulation Cascade Proteins.

Author: Coban A, Bornberg-Bauer E, and Kemena C
Subjects: Animals, Vertebrates genetics, Blood Coagulation genetics, Genome, Blood Coagulation Factors genetics, Blood Coagulation Factors metabolism, Chordata genetics
Abstract: Vertebrate blood coagulation is controlled by a cascade containing more than 20 proteins. The cascade proteins are found in the blood in their zymogen forms and when the cascade is triggered by tissue damage, zymogens are activated and in turn activate their downstream proteins by serine protease activity. In this study, we examined proteomes of 21 chordates, of which 18 are vertebrates, to reveal the modular evolution of the blood coagulation cascade. Additionally, two Arthropoda species were used to compare domain arrangements of the proteins belonging to the hemolymph clotting and the blood coagulation cascades. Within the vertebrate coagulation protein set, almost half of the studied proteins are shared with jawless vertebrates. Domain similarity analyses revealed that there are multiple possible evolutionary trajectories for each coagulation protein. During the evolution of higher vertebrate clades, gene and genome duplications led to the formation of other coagulation cascade proteins. (© 2022. The Author(s).)
Published: 2022
Full Text: View/download PDF

6. The modular nature of protein evolution: domain rearrangement rates across eukaryotic life.

Author: Dohmen E, Klasberg S, Bornberg-Bauer E, Perrey S, and Kemena C
Subjects: Animals, Bees physiology, Disease Resistance genetics, Eukaryotic Cells metabolism, Fungi classification, Fungi genetics, Gene Ontology, Mutation physiology, Phylogeny, Plant Diseases microbiology, Social Behavior, Vertebrates classification, Vertebrates genetics, Vertebrates metabolism, Eukaryota genetics, Eukaryota metabolism, Evolution, Molecular, Protein Structure, Tertiary genetics, Proteins chemistry, Proteins genetics
Abstract: Background: Modularity is important for evolutionary innovation. The recombination of existing units to form larger complexes with new functionalities spares the need to create novel elements from scratch. In proteins, this principle can be observed at the level of protein domains, functional subunits which are regularly rearranged to acquire new functions. Results: In this study we analyse the mechanisms leading to new domain arrangements in five major eukaryotic clades (vertebrates, insects, fungi, monocots and eudicots) at unprecedented depth and breadth. This allows, for the first time, to directly compare rates of rearrangements between different clades and identify both lineage specific and general patterns of evolution in the context of domain rearrangements. We analyse arrangement changes along phylogenetic trees by reconstructing ancestral domain content in combination with feasible single step events, such as fusion or fission. Using this approach we explain up to 70% of all rearrangements by tracing them back to their precursors. We find that rates in general and the ratio between these rates for a given clade in particular, are highly consistent across all clades. In agreement with previous studies, fusions are the most frequent event leading to new domain arrangements. A lineage specific pattern in fungi reveals exceptionally high loss rates compared to other clades, supporting recent studies highlighting the importance of loss for evolutionary innovation. Furthermore, our methodology allows us to link domain emergences at specific nodes in the phylogenetic tree to important functional developments, such as the origin of hair in mammals. Conclusions: Our results demonstrate that domain rearrangements are based on a canonical set of mutational events with rates which lie within a relatively narrow and consistent range. In addition, gained knowledge about these rates provides a basis for advanced domain-based methodologies for phylogenetics and homology analysis which complement current sequence-based methods.
Published: 2020
Full Text: View/download PDF

7. DOGMA: a web server for proteome and transcriptome quality assessment.

Author: Kemena C, Dohmen E, and Bornberg-Bauer E
Subjects: Genome, Internet, Molecular Sequence Annotation, Protein Domains, Proteome, Software, Transcriptome
Abstract: Even in the era of next generation sequencing, in which bioinformatics tools abound, annotating transcriptomes and proteomes remains a challenge. This can have major implications for the reliability of studies based on these datasets. Therefore, quality assessment represents a crucial step prior to downstream analyses on novel transcriptomes and proteomes. DOGMA allows such a quality assessment to be carried out. The data of interest are evaluated based on a comparison with a core set of conserved protein domains and domain arrangements. Depending on the studied species, DOGMA offers precomputed core sets for different phylogenetic clades. We now developed a web server for the DOGMA software, offering a user-friendly, simple to use interface. Additionally, the server provides a graphical representation of the analysis results and their placement in comparison to publicly available data. The server is freely available under https://domainworld-services.uni-muenster.de/dogma/. Additionally, for large scale analyses the software can be downloaded free of charge from https://domainworld.uni-muenster.de. (© The Author(s) 2019. Published by Oxford University Press on behalf of Nucleic Acids Research.)
Published: 2019
Full Text: View/download PDF

8. A Roadmap to Domain Based Proteomics.

Author: Kemena C and Bornberg-Bauer E
Subjects: Computational Biology, Molecular Sequence Annotation, Protein Structure, Tertiary, Databases, Protein, Proteomics methods
Abstract: Protein domains are reusable segments of proteins and play an important role in protein evolution. By combining the elements from a relatively small set of domains into unique arrangements, a large number of distinct proteins can be generated. Since domains often have specific functions, changes in their arrangement usually affect the overall protein function. Furthermore, domains are well amenable to computational representations, e.g., by Hidden Markov Models (HMMs), and these HMMs are widely represented in various databases. Therefore, domains can be efficiently used for proteomic analyses. Here, we describe how domains are annotated using different domain databases and then how to assess the annotation quality of proteomes. We next show how functional annotations of domains in large-scale data such as whole genomes or transcriptomes can be used to analyze molecular differences between species. Furthermore, we describe methods to analyze the changes in domain content of proteins which significantly helps to characterize and reconstruct the modular evolution of proteins. Altogether, domain-based methods offer a computationally highly effective approach to analyze large amounts of proteomic data in an evolutionary setting.
Published: 2019
Full Text: View/download PDF

9. Remodeling of the juvenile hormone pathway through caste-biased gene expression and positive selection along a gradient of termite eusociality.

Author: Jongepier E, Kemena C, Lopez-Ezquerra A, Belles X, Bornberg-Bauer E, and Korb J
Subjects: Animals, Blattellidae genetics, Blattellidae growth & development, Evolution, Molecular, Female, Juvenile Hormones biosynthesis, Juvenile Hormones metabolism, Nymph, Social Behavior, Gene Expression Regulation, Developmental, Isoptera genetics, Juvenile Hormones genetics
Abstract: The evolution of division of labor between sterile and fertile individuals represents one of the major transitions in biological complexity. A fascinating gradient in eusociality evolved among the ancient hemimetabolous insects, ranging from noneusocial cockroaches through the primitively social lower termites, where workers retain the ability to reproduce, to the higher termites, characterized by lifetime commitment to worker sterility. Juvenile hormone (JH) is a prime candidate for the regulation of reproductive division of labor in termites, as it plays a key role in insect postembryonic development and reproduction. We compared the expression of JH pathway genes between workers and queens in two lower termites (Zootermopsis nevadensis and Cryptotermes secundus) and a higher termite (Macrotermes natalensis) to that of analogous nymphs and adult females of the noneusocial cockroach Blattella germanica. JH biosynthesis and metabolism genes ranged from reproductive female-biased expression in the cockroach to predominantly worker-biased expression in the lower termites. Remarkably, the expression profile of JH pathway genes sets the higher termite apart from the two lower termites, as well as the cockroach, indicating that JH signaling has undergone major changes in this eusocial termite. These changes go beyond mere shifts in gene expression between the different castes, as we find evidence for positive selection in several termite JH pathway genes. Thus, remodeling of the JH pathway may have played a major role in termite social evolution, representing a striking case of convergent molecular evolution between the termites and the distantly related social hymenoptera. (© 2018 Wiley Periodicals, Inc.)
Published: 2018
Full Text: View/download PDF

10. Hemimetabolous genomes reveal molecular basis of termite eusociality.

Author: Harrison MC, Jongepier E, Robertson HM, Arning N, Bitard-Feildel T, Chao H, Childers CP, Dinh H, Doddapaneni H, Dugan S, Gowin J, Greiner C, Han Y, Hu H, Hughes DST, Huylmans AK, Kemena C, Kremer LPM, Lee SL, Lopez-Ezquerra A, Mallet L, Monroy-Kuhn JM, Moser A, Murali SC, Muzny DM, Otani S, Piulachs MD, Poelchau M, Qu J, Schaub F, Wada-Katsumata A, Worley KC, Xie Q, Ylla G, Poulsen M, Gibbs RA, Schal C, Richards S, Belles X, Korb J, and Bornberg-Bauer E
Subjects: Animals, Biological Evolution, Blattellidae physiology, Isoptera physiology, Phylogeny, Blattellidae genetics, Evolution, Molecular, Genome, Isoptera genetics, Social Behavior
Abstract: Around 150 million years ago, eusocial termites evolved from within the cockroaches, 50 million years before eusocial Hymenoptera, such as bees and ants, appeared. Here, we report the 2-Gb genome of the German cockroach, Blattella germanica, and the 1.3-Gb genome of the drywood termite Cryptotermes secundus. We show evolutionary signatures of termite eusociality by comparing the genomes and transcriptomes of three termites and the cockroach against the background of 16 other eusocial and non-eusocial insects. Dramatic adaptive changes in genes underlying the production and perception of pheromones confirm the importance of chemical communication in the termites. These are accompanied by major changes in gene regulation and the molecular evolution of caste determination. Many of these results parallel molecular mechanisms of eusocial evolution in Hymenoptera. However, the specific solutions are remarkably different, thus revealing a striking case of convergence in one of the major evolutionary transitions in biological complexity.
Published: 2018
Full Text: View/download PDF

11. Multiple sequence alignment modeling: methods and applications.

Author: Chatzou M, Magis C, Chang JM, Kemena C, Bussotti G, Erb I, and Notredame C
Subjects: Algorithms, DNA, Genomics, Proteins, Reproducibility of Results, Sequence Alignment
Abstract: This review provides an overview on the development of Multiple sequence alignment (MSA) methods and their main applications. It is focused on progress made over the past decade. The three first sections review recent algorithmic developments for protein, RNA/DNA and genomic alignments. The fourth section deals with benchmarks and explores the relationship between empirical and simulated data, along with the impact on method developments. The last part of the review gives an overview on available MSA local reliability estimators and their dependence on various algorithmic properties of available methods. (© The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.)
Published: 2016
Full Text: View/download PDF

12. DOGMA: domain-based transcriptome and proteome quality assessment.

Author: Dohmen E, Kremer LP, Bornberg-Bauer E, and Kemena C
Subjects: Computational Biology, Genome, Proteome, Software, Transcriptome
Abstract: Motivation: Genome studies have become cheaper and easier than ever before, due to the decreased costs of high-throughput sequencing and the free availability of analysis software. However, the quality of genome or transcriptome assemblies can vary a lot. Therefore, quality assessment of assemblies and annotations are crucial aspects of genome analysis pipelines. Results: We developed DOGMA, a program for fast and easy quality assessment of transcriptome and proteome data based on conserved protein domains. DOGMA measures the completeness of a given transcriptome or proteome and provides information about domain content for further analysis. DOGMA provides a very fast way to do quality assessment within seconds. Availability and Implementation: DOGMA is implemented in Python and published under GNU GPL v.3 license. The source code is available on https://ebbgit.uni-muenster.de/domainWorld/DOGMA/. Contact: e.dohmen@wwu.de or c.kemena@wwu.de. Supplementary Information: Supplementary data are available at Bioinformatics online. (© The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.)
Published: 2016
Full Text: View/download PDF

13. How Do Genomes Create Novel Phenotypes? Insights from the Loss of the Worker Caste in Ant Social Parasites.

Author: Smith CR, Helms Cahan S, Kemena C, Brady SG, Yang W, Bornberg-Bauer E, Eriksson T, Gadau J, Helmkampf M, Gotzek D, Okamoto Miyakawa M, Suarez AV, and Mikheyev A
Subjects: Animals, Biological Evolution, Female, Gene Expression Profiling, Genes, Insect, Genetic Association Studies, Genome Components, Male, Reproduction genetics, Selection, Genetic, Transcriptome, Ants classification, Ants genetics, Behavior, Animal physiology, Social Behavior
Abstract: A central goal of biology is to uncover the genetic basis for the origin of new phenotypes. A particularly effective approach is to examine the genomic architecture of species that have secondarily lost a phenotype with respect to their close relatives. In the eusocial Hymenoptera, queens and workers have divergent phenotypes that may be produced via either expression of alternative sets of caste-specific genes and pathways or differences in expression patterns of a shared set of multifunctional genes. To distinguish between these two hypotheses, we investigated how secondary loss of the worker phenotype in workerless ant social parasites impacted genome evolution across two independent origins of social parasitism in the ant genera Pogonomyrmex and Vollenhovia. We sequenced the genomes of three social parasites and their most-closely related eusocial host species and compared gene losses in social parasites with gene expression differences between host queens and workers. Virtually all annotated genes were expressed to some degree in both castes of the host, with most shifting in queen-worker bias across developmental stages. As a result, despite >1 My of divergence from the last common ancestor that had workers, the social parasites showed strikingly little evidence of gene loss, damaging mutations, or shifts in selection regime resulting from loss of the worker caste. This suggests that regulatory changes within a multifunctional genome, rather than sequence differences, have played a predominant role in the evolution of social parasitism, and perhaps also in the many gains and losses of phenotypes in the social insects. (© The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.)
Published: 2015
Full Text: View/download PDF

14. Domain similarity based orthology detection.

Author: Bitard-Feildel T, Kemena C, Greenwood JM, and Bornberg-Bauer E
Subjects: Humans, Computational Biology methods, Protein Interaction Domains and Motifs, Proteins chemistry, Sequence Homology, Amino Acid, Software
Abstract: Background: Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows in a quadratic order. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. Results: We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. Conclusion: We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. The implementation of porthoDom is released using python and C++ languages and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda.
Published: 2015
Full Text: View/download PDF
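
Note on the record above: the abstract names a cosine score over domain content but gives no formula. The following is a minimal sketch of one common way to compute such a score, assuming each protein is represented as a list of domain identifiers and each domain is weighted by its count. The function name `domain_cosine` and the Pfam-style identifiers are illustrative assumptions, not taken from porthoDom, whose actual weighting may differ.

```python
from collections import Counter
from math import sqrt


def domain_cosine(domains_a, domains_b):
    """Cosine similarity between two proteins given as lists of domain IDs.

    Assumption: each domain is weighted by how often it occurs in the
    protein; the published tool may use a different weighting.
    """
    counts_a, counts_b = Counter(domains_a), Counter(domains_b)
    # Dot product over the domains the two proteins share.
    dot = sum(counts_a[d] * counts_b[d] for d in counts_a.keys() & counts_b.keys())
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a protein without annotated domains matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical domain strings, for illustration only.
kinase_like = ["PF00017", "PF00069"]
kinase_only = ["PF00069"]
print(domain_cosine(kinase_like, kinase_only))
```

Because the score depends only on domain counts, it ignores domain order. That is consistent with the abstract's use of domain content to pre-group proteins before a sequence-level ortholog search.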

15. MDAT- Aligning multiple domain arrangements.

Author: Kemena C, Bitard-Feildel T, and Bornberg-Bauer E
Subjects: Humans, Programming Languages, Protein Structure, Tertiary, Algorithms, Proteins chemistry, Sequence Alignment methods, Sequence Analysis, Protein methods, Software
Abstract: Background: Proteins are composed of domains, protein segments that fold independently from the rest of the protein and have a specific function. During evolution the arrangement of domains can change: domains are gained, lost or their order is rearranged. To facilitate the analysis of these changes we propose the use of multiple domain alignments. Results: We developed an alignment program, called MDAT, which aligns multiple domain arrangements. MDAT extends earlier programs which perform pairwise alignments of domain arrangements. MDAT uses a domain similarity matrix to score domain pairs and aligns the domain arrangements using a consistency supported progressive alignment method. Conclusion: MDAT will be useful for analysing changes in domain arrangements within and between protein families and will thus provide valuable insights into the evolution of proteins and their domains. MDAT is coded in C++, and the source code is freely available for download at http://www.bornberglab.org/pages/mdat.
Published: 2015
Full Text: View/download PDF
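
Note on the record above: the abstract mentions pairwise alignment of domain arrangements scored with a domain similarity matrix. The sketch below shows only that pairwise building block, as a plain Needleman-Wunsch global alignment over lists of domain names. It is not MDAT's consistency-supported progressive method. The names `align_arrangements` and `identity_sim`, the gap penalty, and the match/mismatch scores are illustrative assumptions.

```python
def identity_sim(x, y):
    """Toy stand-in for a domain similarity matrix: +2 match, -1 mismatch."""
    return 2 if x == y else -1


def align_arrangements(a, b, sim, gap=-1):
    """Globally align two domain arrangements (lists of domain names).

    Returns (score, aligned_a, aligned_b); "-" marks a gap.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sim(a[i - 1], b[j - 1]),  # pair two domains
                score[i - 1][j] + gap,                          # gap in b
                score[i][j - 1] + gap,                          # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    i, j = n, m
    row_a, row_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sim(a[i - 1], b[j - 1]):
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return score[n][m], row_a[::-1], row_b[::-1]


# Hypothetical arrangements: the second protein lacks the leading SH3 domain.
total, top, bottom = align_arrangements(
    ["SH3", "SH2", "Kinase"], ["SH2", "Kinase"], identity_sim
)
print(total)
print(top)
print(bottom)
```

Swapping `identity_sim` for a lookup into a real domain similarity matrix would let related but non-identical domains score above a plain mismatch, which is the role the abstract assigns to the matrix.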

16. Transposable element islands facilitate adaptation to novel environments in an invasive species.

Author: Schrader L, Kim JW, Ence D, Zimin A, Klein A, Wyschetzki K, Weichselgartner T, Kemena C, Stökl J, Schultner E, Wurm Y, Smith CD, Yandell M, Heinze J, Gadau J, and Oettler J
Subjects: Adaptation, Physiological, Animals, Biological Evolution, Brazil, DNA Methylation, Exons, Gene Deletion, Gene Duplication, Japan, Phylogeography, Polymorphism, Single Nucleotide, Ants genetics, DNA Transposable Elements, Genes, Insect, Genome, Insect, Genomic Islands, Introduced Species
Abstract: Adaptation requires genetic variation, but founder populations are generally genetically depleted. Here we sequence two populations of an inbred ant that diverge in phenotype to determine how variability is generated. Cardiocondyla obscurior has the smallest of the sequenced ant genomes and its structure suggests a fundamental role of transposable elements (TEs) in adaptive evolution. Accumulations of TEs (TE islands) comprising 7.18% of the genome evolve faster than other regions with regard to single-nucleotide variants, gene/exon duplications and deletions and gene homology. A non-random distribution of gene families, larvae/adult specific gene expression and signs of differential methylation in TE islands indicate intragenomic differences in regulation, evolutionary rates and coalescent effective population size. Our study reveals a tripartite interplay between TEs, life history and adaptation in an invasive species.
Published: 2014
Full Text: View/download PDF

17. Alignathon: a competitive assessment of whole-genome alignment methods.

Author: Earl D, Nguyen N, Hickey G, Harris RS, Fitzgerald S, Beal K, Seledtsov I, Molodtsov V, Raney BJ, Clawson H, Kim J, Kemena C, Chang JM, Erb I, Poliakov A, Hou M, Herrero J, Kent WJ, Solovyev V, Darling AE, Ma J, Notredame C, Brudno M, Dubchak I, Haussler D, and Paten B
Subjects: Animals, Computational Biology methods, Computer Simulation, Datasets as Topic, Genome-Wide Association Study, Humans, Mammals genetics, Phylogeny, Reproducibility of Results, Genome, Genomics methods, Sequence Alignment methods, Software
Abstract: Multiple sequence alignments (MSAs) are a prerequisite for a wide variety of evolutionary analyses. Published assessments and benchmark data sets for protein and, to a lesser extent, global nucleotide MSAs are available, but less effort has been made to establish benchmarks in the more general problem of whole-genome alignment (WGA). Using the same model as the successful Assemblathon competitions, we organized a competitive evaluation in which teams submitted their alignments and then assessments were performed collectively after all the submissions were received. Three data sets were used: Two were simulated and based on primate and mammalian phylogenies, and one was comprised of 20 real fly genomes. In total, 35 submissions were assessed, submitted by 10 teams using 12 different alignment pipelines. We found agreement between independent simulation-based and statistical assessments, indicating that there are substantial accuracy differences between contemporary alignment tools. We saw considerable differences in the alignment quality of differently annotated regions and found that few tools aligned the duplications analyzed. We found that many tools worked well at shorter evolutionary distances, but fewer performed competitively at longer distances. We provide all data sets, submissions, and assessment programs for further study and provide, as a resource for future benchmarking, a convenient repository of code and data for reproducing the simulation assessments. (© 2014 Earl et al.; Published by Cold Spring Harbor Laboratory Press.)
Published: 2014
Full Text: View/download PDF

18. SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments.

Author: Di Tommaso P, Bussotti G, Kemena C, Capriotti E, Chatzou M, Prieto P, and Notredame C
Subjects: Algorithms, Internet, Nucleic Acid Conformation, RNA chemistry, Sequence Alignment methods, Sequence Analysis, RNA methods, Software
Abstract: This article introduces the SARA-Coffee web server; a service allowing the online computation of 3D structure based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from http://tcoffee.crg.cat/apps/tcoffee/do:saracoffee. (© The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.)
Published: 2014
Full Text: View/download PDF

19. Using tertiary structure for the computation of highly accurate multiple RNA alignments with the SARA-Coffee package.

Author: Kemena C, Bussotti G, Capriotti E, Marti-Renom MA, and Notredame C
Subjects: Algorithms, Nucleic Acid Conformation, RNA chemistry, Sequence Alignment methods, Sequence Analysis, RNA methods, Software
Abstract: Motivation: Aligning RNAs is useful to search for homologous genes, study evolutionary relationships, detect conserved regions and identify any patterns that may be of biological relevance. Poor levels of conservation among homologs, however, make it difficult to compare RNA sequences, even when considering closely evolutionary related sequences. Results: We describe SARA-Coffee, a tertiary structure-based multiple RNA aligner, which has been validated using BRAliDARTS, a new benchmark framework designed for evaluating tertiary structure-based multiple RNA aligners. We provide two methods to measure the capacity of alignments to match corresponding secondary and tertiary structure features. On this benchmark, SARA-Coffee outperforms both regular aligners and those using secondary structure information. Furthermore, we show that on sequences in which <60% of the nucleotides form base pairs, primary sequence methods usually perform better than secondary-structure aware aligners. Availability and Implementation: The package and the datasets are available from http://www.tcoffee.org/Projects/saracoffee and http://structure.biofold.org/sara/.
Published: 2013
Full Text: View/download PDF

20. Epistasis as the primary factor in molecular evolution.

Author: Breen MS, Kemena C, Vlasov PK, Notredame C, and Kondrashov FA
Subjects: Amino Acid Substitution genetics, Animals, Cell Nucleus genetics, Computational Biology, Genetic Fitness, Genotype, Models, Genetic, Mutation, Organelles genetics, Phylogeny, Proteins chemistry, Proteins genetics, Sequence Alignment, Species Specificity, Epistasis, Genetic genetics, Evolution, Molecular
Abstract: The main forces directing long-term molecular evolution remain obscure. A sizable fraction of amino-acid substitutions seem to be fixed by positive selection, but it is unclear to what degree long-term protein evolution is constrained by epistasis, that is, instances when substitutions that are accepted in one genotype are deleterious in another. Here we obtain a quantitative estimate of the prevalence of epistasis in long-term protein evolution by relating data on amino-acid usage in 14 organelle proteins and 2 nuclear-encoded proteins to their rates of short-term evolution. We studied multiple alignments of at least 1,000 orthologues for each of these 16 proteins from species from a diverse phylogenetic background and found that an average site contained approximately eight different amino acids. Thus, without epistasis an average site should accept two-fifths of all possible amino acids, and the average rate of amino-acid substitutions should therefore be about three-fifths lower than the rate of neutral evolution. However, we found that the measured rate of amino-acid substitution in recent evolution is 20 times lower than the rate of neutral evolution and an order of magnitude lower than that expected in the absence of epistasis. These data indicate that epistasis is pervasive throughout protein evolution: about 90 per cent of all amino-acid substitutions have a neutral or beneficial impact only in the genetic backgrounds in which they occur, and must therefore be deleterious in a different background of other species. Our findings show that most amino-acid substitutions have different fitness effects in different species and that epistasis provides the primary conceptual framework to describe the tempo and mode of long-term protein evolution.
Published: 2012
Full Text: View/download PDF

21. STRIKE: evaluation of protein MSAs using a single 3D structure.

Author: Kemena C, Taly JF, Kleinjung J, and Notredame C
Subjects: Computational Biology methods, Internet, Software, Proteins chemistry, Sequence Alignment methods
Abstract: Motivation: Evaluating alternative multiple protein sequence alignments is an important unsolved problem in Biology. The most accurate way of doing this is to use structural information. Unfortunately, most methods require at least two structures to be embedded in the alignment, a condition rarely met when dealing with standard datasets. Result: We developed STRIKE, a method that determines the relative accuracy of two alternative alignments of the same sequences using a single structure. We validated our methodology on three commonly used reference datasets (BAliBASE, Homestrad and Prefab). Given two alignments, STRIKE manages to identify the most accurate one in 70% of the cases on average. This figure increases to 79% when considering very challenging datasets like the RV11 category of BAliBASE. This discrimination capacity is significantly higher than that reported for other metrics such as Contact Accepted mutation or Blosum. We show that this increased performance results both from a refined definition of the contacts and from the use of an improved contact substitution score. Contact: cedric.notredame@crg.eu. Availability: STRIKE is an open source freeware available from www.tcoffee.org. Supplementary Information: Supplementary data are available at Bioinformatics online.
Published: 2011
Full Text: View/download PDF

22. Using the T-Coffee package to build multiple sequence alignments of protein, RNA, DNA sequences and 3D structures.

Author: Taly JF, Magis C, Bussotti G, Chang JM, Di Tommaso P, Erb I, Espinosa-Carrasco J, Kemena C, and Notredame C
Subjects: Algorithms, Amino Acid Sequence, Base Sequence, Models, Molecular, Molecular Sequence Data, Software, DNA chemistry, Nucleic Acid Conformation, Proteins chemistry, RNA chemistry, Sequence Alignment methods
Abstract: T-Coffee (Tree-based consistency objective function for alignment evaluation) is a versatile multiple sequence alignment (MSA) method suitable for aligning most types of biological sequences. The main strength of T-Coffee is its ability to combine third party aligners and to integrate structural (or homology) information when building MSAs. The series of protocols presented here show how the package can be used to multiply align proteins, RNA and DNA sequences. The protein section shows how users can select the most suitable T-Coffee mode for their data set. Detailed protocols include T-Coffee, the default mode, M-Coffee, a meta version able to combine several third party aligners into one, PSI (position-specific iterated)-Coffee, the homology extended mode suitable for remote homologs and Expresso, the structure-based multiple aligner. We then also show how the T-RMSD (tree based on root mean square deviation) option can be used to produce a functionally informative structure-based clustering. RNA alignment procedures are described for using R-Coffee, a mode able to use predicted RNA secondary structures when aligning RNA sequences. DNA alignments are illustrated with Pro-Coffee, a multiple aligner specific of promoter regions. We also present some of the many reformatting utilities bundled with T-Coffee. The package is an open-source freeware available from http://www.tcoffee.org/.
Published: 2011
Full Text: View/download PDF

23. Upcoming challenges for multiple sequence alignment methods in the high-throughput era.

Author: Kemena C and Notredame C
Subjects: Amino Acid Sequence, Genome, Molecular Sequence Data, Phylogeny, Sequence Analysis, Protein, Algorithms, Computational Biology methods, Sequence Alignment methods
Abstract: This review focuses on recent trends in multiple sequence alignment tools. It describes the latest algorithmic improvements including the extension of consistency-based methods to the problem of template-based multiple sequence alignments. Some results are presented suggesting that template-based methods are significantly more accurate than simpler alternative methods. The validation of existing methods is also discussed at length with the detailed description of recent results and some suggestions for future validation strategies. The last part of the review addresses future challenges for multiple sequence alignment methods in the genomic era, most notably the need to cope with very large sequences, the need to integrate large amounts of experimental data, the need to accurately align non-coding and non-transcribed sequences and finally, the need to integrate many alternative methods and approaches.
Published: 2009
Full Text: View/download PDF
