4,428 results on '"structural genomics"'
Search Results
152. A Consensus Method for Ancestral Recombination Graphs.
- Author
-
Kuhner, Mary and Yamato, Jon
- Subjects
- *
FUNCTIONAL genomics , *GENETIC recombination , *STRUCTURAL genomics , *RECOMBINATION (Chemistry) , *INVARIANT manifolds - Abstract
We propose a consensus method for ancestral recombination graphs (ARGs) that generates a single ARG representing commonalities among a cloud of ARGs defined for the same genomic region and set of taxa. Our method, which we call 'threshold consensus,' treats a genomic location as a potential recombination breakpoint only if the number of ARGs in the cloud possessing a breakpoint at that location exceeds a chosen threshold. The estimate is further refined by ignoring recombinations that do not change the local tree topologies, as well as collapsing breakpoint locations separated only by invariant sites. We test the threshold consensus algorithm, using a range of threshold values, on simulated ARGs inferred by a genealogy sampling algorithm, and evaluate accuracy of breakpoints and local topologies in the resulting consensus ARGs. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
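The counting step behind the 'threshold consensus' idea in record 152 can be sketched as below. This is a minimal illustration, not the authors' implementation: each ARG is reduced to a hypothetical set of breakpoint positions, the function name and toy data are invented, and the refinement steps in the abstract (dropping recombinations that leave local topologies unchanged, collapsing breakpoints separated only by invariant sites) are omitted.

```python
from collections import Counter

def threshold_consensus_breakpoints(arg_breakpoints, threshold):
    """Keep positions whose breakpoint count across the cloud exceeds threshold.

    arg_breakpoints: iterable of collections of breakpoint positions,
    one collection per ARG (a simplifying assumption for this sketch).
    """
    counts = Counter()
    for breakpoints in arg_breakpoints:
        # Count each ARG at most once per position.
        counts.update(set(breakpoints))
    return sorted(pos for pos, n in counts.items() if n > threshold)

# Toy cloud of four ARGs (hypothetical positions).
cloud = [{100, 250, 400}, {100, 250}, {100, 390}, {250, 400}]
consensus = threshold_consensus_breakpoints(cloud, threshold=2)
print(consensus)
```

Raising the threshold keeps only breakpoints shared by more ARGs in the cloud; lowering it admits less widely supported positions.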
153. Common Stress Transcriptome Analysis Reveals Functional and Genomic Architecture Differences Between Early and Delayed Response Genes.
- Author
-
Chung-Wen Lin, Li-Yao Huang, Chao-Li Huang, Yong-Chuan Wang, Pei-Hsuan Lai, Hao-Ven Wang, Wen-Chi Chang, Tzen-Yuh Chiang, and Hao-Jen Huang
- Subjects
- *
ENVIRONMENTAL engineering , *GENETIC transcription , *FERULIC acid , *STRUCTURAL genomics , *GENETIC regulation - Abstract
To identify the similarities among responses to diverse environmental stresses, we analyzed the transcriptome response of rice roots to three rhizotoxic perturbations (chromium, ferulic acid and mercury) and identified common early-transient, early-constant and delayed gene inductions. Common early response genes were mostly associated with signal transduction and hormones, and delayed response genes with lipid metabolism. Network component analysis revealed complicated interactions among common genes, the most highly connected signaling hubs being PP2C68, MPK5, LRR-RLK and NPR1. Gene architecture studies revealed different conserved promoter motifs and a different ratio of CpG island distribution between early and delayed genes. In addition, early-transient genes had more exons and a shorter first exon. IMEter was used to calculate the transcription regulation effects of introns, with greater effects for the first introns of early-transient than delayed genes. The higher Ka/Ks (non-synonymous/synonymous mutation) ratio of early-constant genes than early-transient, delayed and the genome median demonstrates the rapid evolution of early-constant genes. Our results suggest that finely tuned transcriptional control in response to environmental stress in rice depends on genomic architecture and signal intensity and duration. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
154. Blue cheese-making has shaped the population genetic structure of the mould Penicillium roqueforti.
- Author
-
Ropars, Jeanne, López-Villavicencio, Manuela, Snirc, Alodie, Lacoste, Sandrine, and Giraud, Tatiana
- Subjects
- *
CHEESEMAKING , *PENICILLIUM roqueforti , *MOLDS (Fungi) , *FILAMENTOUS fungi , *POPULATION genetics , *FOOD spoilage , *SILAGE - Abstract
Background: Penicillium roqueforti is a filamentous fungus used for making blue cheeses worldwide. It also occurs as a food spoiler and in silage and wood. Previous studies have revealed a strong population genetic structure, with specific traits associated with the different populations. Here, we used a large strain collection from worldwide cheeses published recently to investigate the genetic structure of P. roqueforti. Principal findings: We found a genetic population structure in P. roqueforti that was consistent with previous studies, with two main genetic clusters (W+C+ and W-C-, i.e., with and without horizontal gene transferred regions CheesyTer and Wallaby). In addition, we detected a finer genetic subdivision that corresponded to the environment and to protected designation of origin (PDO), namely the Roquefort PDO. We indeed found evidence for eight genetic clusters, one of the clusters including only strains from environments other than cheeses, and another cluster encompassing only strains from the Roquefort PDO. The W-C- and W+C+ cheese clusters were not the most closely related ones, suggesting that there may have been two independent domestication events of P. roqueforti for making blue cheeses. Significance: The additional population structure revealed here may be relevant for cheese-makers and for understanding the history of domestication in P. roqueforti. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
155. Epistasis Analysis Goes Genome-Wide.
- Author
-
Zhang, Jianzhi
- Subjects
- *
EPISTASIS (Genetics) , *STRUCTURAL genomics , *GENOMES , *ALLELES , *MOLECULAR biology - Abstract
The article discusses how researcher Skwark and colleagues harnessed population genomic data to approach the challenge of epistasis analysis. Topics discussed include the development of a computational method termed genome direct coupling analysis (DCA) to detect epistasis using genotype and allele frequencies estimated from genome sequences of thousands of individuals of the same species.
- Published
- 2017
- Full Text
- View/download PDF
156. The NMR solution structure and function of RPA3313: a putative ribosomal transport protein from Rhodopseudomonas palustris.
- Author
-
Catazaro, Jonathan, Lowe, Austin J., Cerny, Ronald L., and Powers, Robert
- Abstract
Protein function elucidation often relies heavily on amino acid sequence analysis and other bioinformatics approaches. The reliance is extended to structure homology modeling for ligand docking and protein-protein interaction mapping. However, sequence analysis of RPA3313 exposes a large, unannotated class of hypothetical proteins mostly from the Rhizobiales order. In the absence of sequence and structure information, further functional elucidation of this class of proteins has been significantly hindered. A high quality NMR structure of RPA3313 reveals that the protein forms a novel split ββαβ fold with a conserved ligand binding pocket between the first β-strand and the N-terminus of the α-helix. Conserved residue analysis and protein-protein interaction prediction analyses reveal multiple protein binding sites and conserved functional residues. Results of a mass spectrometry proteomic analysis strongly point toward interaction with the ribosome and its subunits. The combined structural and proteomic analyses suggest that RPA3313 by itself or in a larger complex may assist in the transportation of substrates to or from the ribosome for further processing. Proteins 2016; 85:93-102. © 2016 Wiley Periodicals, Inc. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
157. Effects of land use on population presence and genetic structure of an amphibian in an agricultural landscape.
- Author
-
Youngquist, Melissa, Inoue, Kentaro, Boone, Michelle, and Berg, David
- Subjects
SPECIES distribution ,LAND use ,STRUCTURAL genomics ,AMPHIBIANS ,AGRICULTURAL landscape management ,PATCH dynamics - Abstract
Context: Species distributions are a function of an individual's ability to disperse to and colonize habitat patches. These processes depend upon landscape configuration and composition. Objectives: Using Blanchard's cricket frogs (Acris blanchardi), we assessed which land cover types were predictive of (1) presence at three spatial scales (pond-shed, 500 and 2500 m) and (2) genetic structure. We predicted that forested, urban, and road land covers would negatively affect cricket frogs. We also predicted that agricultural, field, and aquatic land covers would positively affect cricket frogs. Methods: We surveyed for cricket frogs at 28 sites in southwestern Ohio, USA to determine presence across different habitats and analyze genetic structure among populations. For our first objective, we examined if land use (crop, field, forest, and urban habitat) and landscape features (ponds, streams, and roads) explained presence; for our second objective, we assessed whether these land cover types explained genetic distance between populations. Results: Land cover did not have a strong influence on cricket frog presence. However, multiple competing models suggested effects of roads, streams, and land use. We found genetic structuring: populations were grouped into five major clusters and nine finer-scale clusters. Highways were predictive of increased genetic distance. Conclusions: By combining a focal-patch study with landscape genetics, our study suggests that major roads and waterways are key features affecting species distributions in agricultural landscapes. We demonstrate that cricket frogs may respond to landscape features at larger spatial scales, and that presence and movement may be affected by different environmental factors. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
158. SATurn: a modular bioinformatics framework for the design of robust maintainable web-based and standalone applications.
- Author
-
Damerell, David R, Strain-Damerell, Claire, Garsot, Sefa, Joyce, Stephen P, Barrett, Paul, and Marsden, Brian D
- Subjects
- *
BIOINFORMATICS , *RESEARCH grants , *STRUCTURAL genomics , *GLYCOMICS , *CHEMICAL biology - Abstract
Summary SATurn is a modular, open-source, bioinformatics platform designed to specifically address the problems of maintenance and longevity commonly associated with the development of simple tools funded by academic research grants. Applications developed in SATurn can be deployed as web-based tools, standalone applications or hybrid tools which have the benefits of both. Within the Structural Genomics Consortium we have utilized SATurn to create a bioinformatics portal which routinely supports a diverse group of scientists including those interested in structural biology, cloning, glycobiology and chemical biology. Availability and implementation https://github.com/ddamerell53/SATurn Supplementary information Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
159. The Nucleome Data Bank: web-based resources to simulate and analyze the three-dimensional genome
- Author
-
Ryan R. Cheng, Arya Hajitaheri, Erez Lieberman Aiden, Esteban Dodero-Rojas, Matheus F. Mello, Vinícius G. Contessoto, José N. Onuchic, Peter G. Wolynes, Michele Di Pierro, Rice University, Brazilian Center for Research in Energy and Materials-CNPEM, Universidade Estadual Paulista (Unesp), University of Houston, University of Costa Rica, Military Institute of Engineering, Baylor College of Medicine, and Northeastern University
- Subjects
AcademicSubjects/SCI00010 ,Molecular Conformation ,Genomics ,Computational biology ,Biology ,computer.software_genre ,ENCODE ,Genome ,Epigenesis, Genetic ,Structural genomics ,03 medical and health sciences ,0302 clinical medicine ,Databases, Genetic ,Genetics ,Database Issue ,Humans ,Data bank ,Web application ,In Situ Hybridization, Fluorescence ,030304 developmental biology ,Internet ,0303 health sciences ,Genome, Human ,business.industry ,Computational Biology ,Molecular Sequence Annotation ,Pipeline (software) ,Chromatin ,3. Good health ,Shared resource ,A549 Cells ,Fish ,Data mining ,business ,computer ,Software ,030217 neurology & neurosurgery - Abstract
We introduce the Nucleome Data Bank (NDB), a web-based platform to simulate and analyze the three-dimensional (3D) organization of genomes. The NDB enables physics-based simulation of chromosomal structural dynamics through the MEGABASE + MiChroM computational pipeline. The input of the pipeline consists of epigenetic information sourced from the Encode database; the output consists of the trajectories of chromosomal motions that accurately predict Hi-C and fluorescence in situ hybridization data, as well as multiple observations of chromosomal dynamics in vivo. As an intermediate step, users can also generate chromosomal sub-compartment annotations directly from the same epigenetic input, without the use of any DNA-DNA proximity ligation data. Additionally, the NDB freely hosts both experimental and computational structural genomics data. Besides being able to perform their own genome simulations and download the hosted data, users can also analyze and visualize the same data through custom-designed web-based tools. In particular, the one-dimensional genetic and epigenetic data can be overlaid onto accurate 3D structures of chromosomes, to study the spatial distribution of genetic and epigenetic features. The NDB aims to be a shared resource to biologists, biophysicists and all genome scientists. The NDB is available at https://ndb.rice.edu.
Center for Theoretical Biological Physics Rice University Brazilian Biorenewables National Laboratory-LNBR Brazilian Center for Research in Energy and Materials-CNPEM Department of Physics São Paulo State University (UNESP) Institute of Biosciences Humanities and Exact Sciences Department of Computer Science University of Houston Theoretical and Computational Physics Laboratory University of Costa Rica Chemical Engineering Department Military Institute of Engineering The Center for Genome Architecture Department of Molecular and Human Genetics Baylor College of Medicine Department of Physics and Astronomy Rice University Department of Chemistry Rice University Department of Biosciences Rice University Department of Physics Northeastern University Department of Physics São Paulo State University (UNESP) Institute of Biosciences Humanities and Exact Sciences Welch Foundation: C-0016 Welch Foundation: C-1792 National Science Foundation: CHE-1614101 National Science Foundation: PHY-2019745
- Published
- 2020
- Full Text
- View/download PDF
160. Crystal structure of Nsp15 endoribonuclease NendoU from SARS‐CoV‐2
- Author
-
M. Wilamowski, Youngchang Kim, Robert Jedrzejczak, Natalia Maltseva, Karolina Michalska, Andrzej Joachimiak, Adam Godzik, and Matthias Endres
- Subjects
Models, Molecular ,Protein Conformation, alpha-Helical ,viruses ,Oligonucleotides ,Gene Expression ,Nsp15 ,Plasma protein binding ,Viral Nonstructural Proteins ,Crystallography, X-Ray ,medicine.disease_cause ,Biochemistry ,SARS‐CoV‐2 ,Substrate Specificity ,EndoU family ,Catalytic Domain ,Cloning, Molecular ,skin and connective tissue diseases ,Peptide sequence ,0303 health sciences ,030302 biochemistry & molecular biology ,NendoU ,virus diseases ,Articles ,Recombinant Proteins ,Severe acute respiratory syndrome-related coronavirus ,Middle East Respiratory Syndrome Coronavirus ,Protein Binding ,crystal structure ,Middle East respiratory syndrome coronavirus ,Viral protein ,endoribonuclease ,Genetic Vectors ,Endoribonuclease ,Biology ,Article ,Virus ,Structural genomics ,Betacoronavirus ,03 medical and health sciences ,COVID‐19 ,Endoribonucleases ,Escherichia coli ,medicine ,Humans ,Protein Interaction Domains and Motifs ,Amino Acid Sequence ,Molecular Biology ,030304 developmental biology ,Sequence Homology, Amino Acid ,SARS-CoV-2 ,fungi ,biology.organism_classification ,Virology ,body regions ,Protein Conformation, beta-Strand ,Sequence Alignment - Abstract
Severe Acute Respiratory Syndrome coronavirus 2 (SARS‐CoV‐2) is rapidly spreading around the world. There is no existing vaccine or proven drug to prevent infections and stop virus proliferation. Although this virus is similar to human and animal SARS‐CoVs and Middle East Respiratory Syndrome coronavirus (MERS‐CoVs), detailed information about SARS‐CoV‐2 protein structures and functions is urgently needed to rapidly develop effective vaccines, antibodies, and antivirals. We applied the high‐throughput protein production and structure determination pipeline at the Center for Structural Genomics of Infectious Diseases to produce SARS‐CoV‐2 proteins and structures. Here we report two high‐resolution crystal structures of endoribonuclease Nsp15/NendoU. We compare these structures with previously reported homologs from SARS and MERS coronaviruses.
- Published
- 2020
- Full Text
- View/download PDF
161. Toward a structome of Acinetobacter baumannii drug targets
- Author
-
Lynn K. Barrett, Logan Tillery, David M. Dranow, Donald D. Lorimer, Jan Abendroth, Kayleigh F. Barrett, Sandhya Subramanian, Isabelle Q. Phan, Roger Shek, Ian Chun, Justin K. Craig, Wesley C. Van Voorhis, and Thomas E. Edwards
- Subjects
Acinetobacter baumannii ,Models, Molecular ,Protein Conformation ,Full‐length Papers ,Methionine-tRNA Ligase ,Computational biology ,Biology ,Biochemistry ,Structural genomics ,03 medical and health sciences ,Antibiotic resistance ,Bacterial Proteins ,Drug Resistance, Bacterial ,Humans ,Uroporphyrinogen Decarboxylase ,Molecular Biology ,Gene ,030304 developmental biology ,0303 health sciences ,Coproporphyrinogen Oxidase ,030302 biochemistry & molecular biology ,computer.file_format ,Protein Data Bank ,biology.organism_classification ,Anti-Bacterial Agents ,Essential gene ,Infectious disease (medical specialty) ,Transposon mutagenesis ,computer ,Genome, Bacterial - Abstract
Acinetobacter baumannii is well known for causing hospital‐associated infections due in part to its intrinsic antibiotic resistance as well as its ability to remain viable on surfaces and resist cleaning agents. In a previous publication, A. baumannii strain AB5075 was studied by transposon mutagenesis and 438 essential gene candidates for growth on rich medium were identified. The Seattle Structural Genomics Center for Infectious Disease entered 342 of these candidate essential genes into our pipeline for structure determination, in which 306 were successfully cloned into expression vectors, 192 were detectably expressed, 165 screened as soluble, 121 were purified, 52 crystallized, 30 provided diffraction data, and 29 structures were deposited in the Protein Data Bank. Here, we report these structures, compare them with human orthologs where applicable, and discuss their potential as drug targets for antibiotic development against A. baumannii.
- Published
- 2020
- Full Text
- View/download PDF
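The pipeline counts quoted in the abstract of record 161 lend themselves to a short attrition calculation. The sketch below uses only the numbers stated in that abstract; the stage labels and the function name are shorthand introduced here for illustration, not terms from the paper.

```python
# Target counts at each pipeline stage, as quoted in the abstract.
stages = [
    ("entered", 342),
    ("cloned", 306),
    ("expressed", 192),
    ("soluble", 165),
    ("purified", 121),
    ("crystallized", 52),
    ("diffraction data", 30),
    ("deposited", 29),
]

def stage_yields(stages):
    """Return (stage name, fraction of the previous stage that survived)."""
    return [
        (name, count / prev_count)
        for (_, prev_count), (name, count) in zip(stages, stages[1:])
    ]

yields = stage_yields(stages)
overall = stages[-1][1] / stages[0][1]  # 29 of 342 targets

for name, fraction in yields:
    print(f"{name:>16}: {fraction:.1%}")
print(f"{'overall':>16}: {overall:.1%}")
```

On these figures, crystallization is the step with the largest proportional loss (52 of 121 purified proteins), and roughly 8.5% of the targets entered ended as deposited structures.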
162. Detecting “protein words” through unsupervised word segmentation [version 1; referees: 1 approved with reservations, 1 not approved]
- Author
-
Wang Liang and Zhao Kaiyong
- Subjects
Method Article ,Articles ,Protein Chemistry & Proteomics ,Structural Genomics ,Theory & Simulation ,Word segmentation ,Protein sequence ,Protein secondary structure ,Unsupervised method ,Soft counting ,Protein word ,Gene finding ,Description length - Abstract
Unsupervised word segmentation methods were applied to analyze protein sequences. Protein sequences, such as “MTMDKSELVQKA…,” were used as input to these methods. Segmented protein word sequences, such as “MTM DKSE LVQKA,” were then obtained. We compared the protein words derived via unsupervised segmentation and protein secondary structure segmentation. An interesting finding is that unsupervised word segmentation is more efficient than secondary structure segmentation in expressing information. Our experiment also suggests the presence of several “protein ruins” in current non-coding regions.
- Published
- 2015
- Full Text
- View/download PDF
163. Computer-aided drug discovery [version 1; referees: 3 approved]
- Author
-
Jürgen Bajorath
- Subjects
Review ,Articles ,Biocatalysis ,Bioinformatics ,Biomacromolecule-Ligand Interactions ,Drug Discovery & Design ,Experimental Biophysical Methods ,Genomics ,Macromolecular Chemistry ,Medical Genetics ,Molecular Pharmacology ,Neuropharmacology & Psychopharmacology ,Pharmacokinetics & Drug Delivery ,Protein Folding ,Small Molecule Chemistry ,Structural Genomics ,Theory & Simulation ,Toxicology ,computational ,drug discovery ,in silico - Abstract
Computational approaches are an integral part of interdisciplinary drug discovery research. Understanding the science behind computational tools, their opportunities, and limitations is essential to make a true impact on drug discovery at different levels. If applied in a scientifically meaningful way, computational methods improve the ability to identify and evaluate potential drug molecules, but there remain weaknesses in the methods that preclude naïve applications. Herein, current trends in computer-aided drug discovery are reviewed, and selected computational areas are discussed. Approaches are highlighted that aid in the identification and optimization of new drug candidates. Emphasis is put on the presentation and discussion of computational concepts and methods, rather than case studies or application examples. As such, this contribution aims to provide an overview of the current methodological spectrum of computational drug discovery for a broad audience.
- Published
- 2015
- Full Text
- View/download PDF
164. Resources, challenges and way forward in rare mitochondrial diseases research [version 2; referees: 2 approved]
- Author
-
Neeraj Kumar Rajput, Vipin Singh, and Anshu Bhardwaj
- Subjects
Review ,Articles ,Genomics ,Structural Genomics ,rare disease ,mitochondria ,mitochondrial DNA ,genome variation ,crowdsourcing ,crowdfunding ,semantic-web ,next generation sequencing - Abstract
Over 300 million people are affected by about 7000 rare diseases globally. There are tremendous resource limitations and challenges in driving research and drug development for rare diseases. Hence, innovative approaches are needed to identify potential solutions. This review focuses on the resources developed over the past years for analysis of genome data towards understanding disease biology especially in the context of mitochondrial diseases, given that mitochondria are central to major cellular pathways and their dysfunction leads to a broad spectrum of diseases. Platforms for collaboration of research groups, clinicians and patients and the advantages of community collaborative efforts in addressing rare diseases are also discussed. The review also describes crowdsourcing and crowdfunding efforts in rare diseases research and how the upcoming initiatives for understanding disease biology including analyses of large number of genomes are also applicable to rare diseases.
- Published
- 2015
- Full Text
- View/download PDF
165. Villain of Molecular Biology: Why are we not reproducible in research? [version 1; referees: 2 approved]
- Author
-
Vikash Bhardwaj
- Subjects
Opinion Article ,Articles ,Animal Genetics ,Evolutionary/Comparative Genetics ,Genomics ,Structural Genomics ,Reproducibility ,Irreproducibility ,Molecular biology ,DNA ,Research - Abstract
Worldwide, there is an issue of irreproducibility in life science research. In the USA alone $28 billion per year spent on preclinical research is not reproducible. Within this opinion article, I provide a brief historical account of the discovery of the Watson-Crick DNA model and introduce another neglected model of DNA. This negligence may be one of the fundamental reasons behind irreproducibility in molecular biology research.
- Published
- 2015
- Full Text
- View/download PDF
166. A bioinformatics insight to rhizobial globins: gene identification and mapping, polypeptide sequence and phenetic analysis, and protein modeling. [version 1; referees: 2 approved]
- Author
-
Reinier Gesto-Borroto, Miriam Sánchez-Sánchez, and Raúl Arredondo-Peter
- Subjects
Research Article ,Articles ,Bioinformatics ,Structural Genomics ,Structure: Transcription & Translation ,Burkholderia ,Cupriavidus ,flavohemoglobin ,globin-coupled sensor ,Rhizobium ,single-domain globin ,truncated (2/2) hemoglobin - Abstract
Globins (Glbs) are proteins widely distributed in organisms. Three evolutionary families have been identified in Glbs: the M, S and T Glb families. The M Glbs include flavohemoglobins (fHbs) and single-domain Glbs (SDgbs); the S Glbs include globin-coupled sensors (GCSs), protoglobins and sensor single domain globins, and the T Glbs include truncated Glbs (tHbs). Structurally, the M and S Glbs exhibit 3/3-folding whereas the T Glbs exhibit 2/2-folding. Glbs are widespread in bacteria, including several rhizobial genomes. However, only a few rhizobial Glbs have been characterized. Hence, we characterized Glbs from 62 rhizobial genomes using bioinformatics methods such as data mining in databases, sequence alignment, phenogram construction and protein modeling. Also, we analyzed soluble extracts from Bradyrhizobium japonicum USDA38 and USDA58 by (reduced + carbon monoxide (CO) minus reduced) differential spectroscopy. Database searching showed that only fhb, sdgb, gcs and thb genes exist in the rhizobia analyzed in this work. Promoter analysis revealed that apparently several rhizobial glb genes are not regulated by a -10 promoter but might be regulated by -35 and Fnr (fumarate-nitrate reduction regulator)-like promoters. Mapping analysis revealed that rhizobial fhbs and thbs are flanked by a variety of genes whereas several rhizobial sdgbs and gcss are flanked by genes coding for proteins involved in the metabolism of nitrates and nitrites and chemotaxis, respectively. Phenetic analysis showed that rhizobial Glbs segregate into the M, S and T Glb families, while structural analysis showed that predicted rhizobial SDgbs and fHbs and GCSs globin domain and tHbs fold into the 3/3- and 2/2-folding, respectively. Spectra from B. japonicum USDA38 and USDA58 soluble extracts exhibited peaks and troughs characteristic of bacterial and vertebrate Glbs thus indicating that putative Glbs are synthesized in B. japonicum USDA38 and USDA58.
- Published
- 2015
- Full Text
- View/download PDF
167. Resources, challenges and way forward in rare mitochondrial diseases research [version 1; referees: 2 approved]
- Author
-
Neeraj Kumar Rajput, Vipin Singh, and Anshu Bhardwaj
- Subjects
Review ,Articles ,Genomics ,Structural Genomics ,rare disease ,mitochondria ,mitochondrial DNA ,genome variation ,crowdsourcing ,crowdfunding ,semantic-web ,next generation sequencing - Abstract
Over 300 million people are affected by about 7000 rare diseases globally. There are tremendous resource limitations and challenges in driving research and drug development for rare diseases. Hence, innovative approaches are needed to identify potential solutions. This review focuses on the resources developed over the past years for analysis of genome data towards understanding disease biology especially in the context of mitochondrial diseases, given that mitochondria are central to major cellular pathways and their dysfunction leads to a broad spectrum of diseases. Platforms for collaboration of research groups, clinicians and patients and the advantages of community collaborative efforts in addressing rare diseases are also discussed. The review also describes crowdsourcing and crowdfunding efforts in rare diseases research and how the upcoming initiatives for understanding disease biology including analyses of large number of genomes are also applicable to rare diseases.
- Published
- 2015
- Full Text
- View/download PDF
168. Mutational landscape of SARS-CoV-2 genome in Turkey and impact of mutations on spike protein structure
- Author
-
Ozden Hatirnaz Ng, Sezer Akyoney, Ilayda Sahin, Huseyin Okan Soykam, Gunseli Bayram Akcapinar, Ozkan Ozdemir, Derya Dilek Kancagi, Gozde Sir Karakus, Bulut Yurtsever, Ayse Sesin Kocagoz, Ercument Ovali, and Ugur Ozbek
- Subjects
RNA viruses ,Male ,Turkey ,Coronaviruses ,Gene Identification and Analysis ,Biochemistry ,Geographical Locations ,Database and Informatics Methods ,Biochemical Simulations ,Macromolecular Structure Analysis ,Pathology and laboratory medicine ,Phylogeny ,Viral Genomics ,Multidisciplinary ,Genomics ,Medical microbiology ,Middle Aged ,Turkey (Country) ,Europe ,Viruses ,Spike Glycoprotein, Coronavirus ,Viral Genome ,Medicine ,Female ,SARS CoV 2 ,Pathogens ,Algorithms ,Research Article ,Adult ,Protein Structure ,Asia ,SARS coronavirus ,Science ,Microbial Genomics ,Genome, Viral ,Molecular Dynamics Simulation ,Research and Analysis Methods ,Microbiology ,Young Adult ,Protein Domains ,Virology ,Genetics ,Humans ,Molecular Biology ,Mutation Detection ,Aged ,Medicine and health sciences ,Biology and life sciences ,SARS-CoV-2 ,Organisms ,Viral pathogens ,Computational Biology ,Proteins ,COVID-19 ,Microbial pathogens ,Biological Databases ,People and Places ,Mutation Databases ,Mutation ,Structural Genomics - Abstract
The Coronavirus Disease 2019 (COVID-19) was declared a pandemic in March 2020 by the World Health Organization (WHO). As of May 25th, 2021, 2,059,941 SARS-CoV-2 genome sequences had been submitted to the GISAID database, with numerous variations. Here, we aim to analyze the SARS-CoV-2 genome data submitted to the GISAID database from Turkey and to determine the variant and clade distributions by the end of May 2021, in accordance with their appearance timeline. We compared these findings to USA, Europe, and Asia data as well. We have also evaluated the effects of spike protein variations, detected in a group of genome sequences of 13 patients who applied to our clinic, by using 3D modeling algorithms. For this purpose, we analyzed 4607 SARS-CoV-2 genome sequences submitted by different lab centers from Turkey to the GISAID database between March 2020 and May 2021. Described mutations were also introduced in silico to the spike protein structure to analyze their isolated impacts on the protein structure. The most abundant clade was GR followed by G, GH, and GRY and we did not detect any V clade. The most common variant was B.1, followed by B.1.1, and the UK variant, B.1.1.7. Our results clearly show a concordance between the variant distributions, the number of cases, and the timelines of different variant accumulations in Turkey. The 3D simulations indicate an increase in the surface hydrophilicity of the reference spike protein and the detected mutations. There was less surface hydrophilicity increase in the Asp614Gly mutation, which exhibits a more compact conformation around the ACE-2 receptor binding domain region, rendering the structure in a “down” conformation. Our genomic findings can help to model vaccination programs and protein modeling may lead to different approaches for COVID-19 treatment strategies.
- Published
- 2021
169. Novel Algorithm for Improved Protein Classification Using Graph Similarity
- Author
-
Chin-Wei Hsu, Sun-Yuan Hsieh, Kai-Hsun Yao, Hsin-Hung Chou, Ching-Tien Hsu, and Hao-Ching Wang
- Subjects
Computer science ,business.industry ,Applied Mathematics ,Genome project ,Graph similarity ,Vertex (geometry) ,Structural genomics ,Task (computing) ,ComputingMethodologies_PATTERNRECOGNITION ,Protein structure ,Data sequences ,Software ,Genetics ,business ,Algorithm ,Biotechnology - Abstract
Considerable sequence data are produced in genome annotation projects that relate to molecular levels, structural similarities, and molecular and biological functions. In structural genomics, the most essential task involves resolving protein structures efficiently with hardware or software, understanding these structures, and assigning their biological functions. Understanding the characteristics and functions of proteins enables the exploration of the molecular mechanisms of life. In this paper, we examine the problems of protein classification. Because they perform similar biological functions, proteins in the same family usually share similar structural characteristics. We employed this premise in designing a classification algorithm. In this algorithm, auxiliary graphs are used to represent proteins, with every amino acid in a protein corresponding to a vertex in a graph. Moreover, the links between amino acids correspond to the edges between the vertices. The proposed algorithm classifies proteins according to the similarities in their graphical structures. The proposed algorithm is efficient and accurate in distinguishing proteins from different families and outperformed related algorithms experimentally.
- Published
- 2021
170. Resources, challenges and way forward in rare mitochondrial diseases research [v2; ref status: indexed, http://f1000r.es/5r6]
- Author
-
Neeraj Kumar Rajput, Vipin Singh, and Anshu Bhardwaj
- Subjects
Genomics ,Structural Genomics ,Medicine ,Science - Abstract
Over 300 million people are affected by about 7000 rare diseases globally. There are tremendous resource limitations and challenges in driving research and drug development for rare diseases. Hence, innovative approaches are needed to identify potential solutions. This review focuses on the resources developed over the past years for analysis of genome data towards understanding disease biology, especially in the context of mitochondrial diseases, given that mitochondria are central to major cellular pathways and their dysfunction leads to a broad spectrum of diseases. Platforms for collaboration of research groups, clinicians and patients and the advantages of community collaborative efforts in addressing rare diseases are also discussed. The review also describes crowdsourcing and crowdfunding efforts in rare diseases research and how the upcoming initiatives for understanding disease biology, including analyses of large numbers of genomes, are also applicable to rare diseases.
- Published
- 2015
- Full Text
- View/download PDF
171. Computer-aided drug discovery [v1; ref status: indexed, http://f1000r.es/5ij]
- Author
-
Jürgen Bajorath
- Subjects
Biocatalysis ,Bioinformatics ,Biomacromolecule-Ligand Interactions ,Drug Discovery & Design ,Experimental Biophysical Methods ,Genomics ,Macromolecular Chemistry ,Medical Genetics ,Molecular Pharmacology ,Neuropharmacology & Psychopharmacology ,Pharmacokinetics & Drug Delivery ,Protein Folding ,Small Molecule Chemistry ,Structural Genomics ,Theory & Simulation ,Toxicology ,Medicine ,Science - Abstract
Computational approaches are an integral part of interdisciplinary drug discovery research. Understanding the science behind computational tools, their opportunities, and limitations is essential to make a true impact on drug discovery at different levels. If applied in a scientifically meaningful way, computational methods improve the ability to identify and evaluate potential drug molecules, but there remain weaknesses in the methods that preclude naïve applications. Herein, current trends in computer-aided drug discovery are reviewed, and selected computational areas are discussed. Approaches are highlighted that aid in the identification and optimization of new drug candidates. Emphasis is put on the presentation and discussion of computational concepts and methods, rather than case studies or application examples. As such, this contribution aims to provide an overview of the current methodological spectrum of computational drug discovery for a broad audience.
- Published
- 2015
- Full Text
- View/download PDF
172. Villain of Molecular Biology: Why are we not reproducible in research? [v1; ref status: indexed, http://f1000r.es/5ou]
- Author
-
Vikash Bhardwaj
- Subjects
Animal Genetics ,Evolutionary/Comparative Genetics ,Genomics ,Structural Genomics ,Medicine ,Science - Abstract
Worldwide, there is an issue of irreproducibility in life science research. In the USA alone $28 billion per year spent on preclinical research is not reproducible. Within this opinion article, I provide a brief historical account of the discovery of the Watson-Crick DNA model and introduce another neglected model of DNA. This negligence may be one of the fundamental reasons behind irreproducibility in molecular biology research.
- Published
- 2015
- Full Text
- View/download PDF
173. A bioinformatics insight to rhizobial globins: gene identification and mapping, polypeptide sequence and phenetic analysis, and protein modeling. [v1; ref status: indexed, http://f1000r.es/5ai]
- Author
-
Reinier Gesto-Borroto, Miriam Sánchez-Sánchez, and Raúl Arredondo-Peter
- Subjects
Bioinformatics ,Structural Genomics ,Structure: Transcription & Translation ,Medicine ,Science - Abstract
Globins (Glbs) are proteins widely distributed in organisms. Three evolutionary families have been identified in Glbs: the M, S and T Glb families. The M Glbs include flavohemoglobins (fHbs) and single-domain Glbs (SDgbs); the S Glbs include globin-coupled sensors (GCSs), protoglobins and sensor single domain globins, and the T Glbs include truncated Glbs (tHbs). Structurally, the M and S Glbs exhibit 3/3-folding whereas the T Glbs exhibit 2/2-folding. Glbs are widespread in bacteria, including several rhizobial genomes. However, only a few rhizobial Glbs have been characterized. Hence, we characterized Glbs from 62 rhizobial genomes using bioinformatics methods such as data mining in databases, sequence alignment, phenogram construction and protein modeling. Also, we analyzed soluble extracts from Bradyrhizobium japonicum USDA38 and USDA58 by (reduced + carbon monoxide (CO) minus reduced) differential spectroscopy. Database searching showed that only fhb, sdgb, gcs and thb genes exist in the rhizobia analyzed in this work. Promoter analysis revealed that apparently several rhizobial glb genes are not regulated by a -10 promoter but might be regulated by -35 and Fnr (fumarate-nitrate reduction regulator)-like promoters. Mapping analysis revealed that rhizobial fhbs and thbs are flanked by a variety of genes whereas several rhizobial sdgbs and gcss are flanked by genes coding for proteins involved in the metabolism of nitrates and nitrites and chemotaxis, respectively. Phenetic analysis showed that rhizobial Glbs segregate into the M, S and T Glb families, while structural analysis showed that predicted rhizobial SDgbs and fHbs and GCSs globin domain and tHbs fold into the 3/3- and 2/2-folding, respectively. Spectra from B. japonicum USDA38 and USDA58 soluble extracts exhibited peaks and troughs characteristic of bacterial and vertebrate Glbs, thus indicating that putative Glbs are synthesized in B. japonicum USDA38 and USDA58.
- Published
- 2015
- Full Text
- View/download PDF
174. Resources, challenges and way forward in rare mitochondrial diseases research [v1; ref status: indexed, http://f1000r.es/54x]
- Author
-
Anshu Bhardwaj, Neeraj Kumar Rajput, and Vipin Singh
- Subjects
Genomics ,Structural Genomics ,Medicine ,Science - Abstract
Over 300 million people are affected by about 7000 rare diseases globally. There are tremendous resource limitations and challenges in driving research and drug development for rare diseases. Hence, innovative approaches are needed to identify potential solutions. This review focuses on the resources developed over the past years for analysis of genome data towards understanding disease biology, especially in the context of mitochondrial diseases, given that mitochondria are central to major cellular pathways and their dysfunction leads to a broad spectrum of diseases. Platforms for collaboration of research groups, clinicians and patients and the advantages of community collaborative efforts in addressing rare diseases are also discussed. The review also describes crowdsourcing and crowdfunding efforts in rare diseases research and how the upcoming initiatives for understanding disease biology, including analyses of large numbers of genomes, are also applicable to rare diseases.
- Published
- 2015
- Full Text
- View/download PDF
175. Inferential Structure Determination of Chromosomes from Single-Cell Hi-C Data.
- Author
-
Carstens, Simeon, Nilges, Michael, and Habeck, Michael
- Subjects
- *
CHROMOSOME structure , *SINGLE cell proteins , *INFERENTIAL statistics , *GENOMES , *THREE-dimensional display systems - Abstract
Chromosome conformation capture (3C) techniques have revealed many fascinating insights into the spatial organization of genomes. 3C methods typically provide information about chromosomal contacts in a large population of cells, which makes it difficult to draw conclusions about the three-dimensional organization of genomes in individual cells. Recently it became possible to study single cells with Hi-C, a genome-wide 3C variant, demonstrating a high cell-to-cell variability of genome organization. In principle, restraint-based modeling should allow us to infer the 3D structure of chromosomes from single-cell contact data, but it suffers from the sparsity and low resolution of chromosomal contacts. To address these challenges, we adapt the Bayesian Inferential Structure Determination (ISD) framework, originally developed for NMR structure determination of proteins, to infer statistical ensembles of chromosome structures from single-cell data. Using ISD, we are able to compute structural error bars and estimate model parameters, thereby eliminating potential bias imposed by ad hoc parameter choices. We apply and compare different models for representing the chromatin fiber and for incorporating single-cell contact information. Finally, we extend our approach to the analysis of diploid chromosome data. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
176. Structural and functional characterization of a cold-adapted stand-alone TPM domain reveals a relationship between dynamics and phosphatase activity.
- Author
-
Pellizza, Leonardo A., Smal, Clara, Ithuralde, Raúl E., Turjanski, Adrián G., Cicero, Daniel O., and Arán, Martín
- Subjects
- *
PHOSPHATASES , *ENZYME activation , *HYDROLASES , *DYNAMICS , *CRYSTAL structure - Abstract
The TPM domain constitutes a family of recently characterized protein domains that are present in most living organisms. Although some progress has been made in understanding the cellular role of TPM-containing proteins, the relationship between structure and function is not clear yet. We have recently solved the solution and crystal structures of one TPM domain (BA42) from the Antarctic bacterium Bizionia argentinensis. In this work, we demonstrate that BA42 has phosphoric-monoester hydrolase activity. The activity of BA42 is strictly dependent on the binding of divalent metals and retains nearly 70% of the maximum at 4 °C, a typical characteristic of cold-adapted enzymes. From HSQC, 15N relaxation measurements, and molecular dynamics studies, we determine that the flexibility of the crossing loops is associated with the protein activity. Thermal unfolding experiments showed that the local increment in flexibility of Mg2+-bound BA42, when compared with Ca2+-bound BA42, is associated with a decrease in global protein stability. Finally, through mutagenesis experiments, we unambiguously demonstrate that the region comprising the metal-binding site participates in the catalytic mechanism. The results shown here contribute to the understanding of the relationship between structure and function of this new family of TPM domains, providing important cues on the regulatory role of Mg2+ and Ca2+ and the molecular mechanism underlying enzyme activity at low temperatures. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
177. An Algorithm for Finding the Singleton Attractors and Pre-Images in Strong-Inhibition Boolean Networks.
- Author
-
He, Zhiwei, Zhan, Meng, Liu, Shuai, Fang, Zebo, and Yao, Chenggui
- Subjects
- *
GENETIC algorithms , *GENE regulatory networks , *IMAGE analysis , *BOOLEAN functions , *ATTRACTORS (Mathematics) - Abstract
The detection of singleton attractors is of great significance for the systematic study of genetic regulatory networks. In this paper, we design an algorithm to compute the singleton attractors and pre-images of strong-inhibition Boolean networks, which are a biophysically plausible gene model. Our algorithm not only identifies the singleton attractors accurately, but also finds the pre-images of the network easily. Based on extensive computational experiments, we show that the computational time of the algorithm is proportional to the number of singleton attractors, which indicates that the algorithm is particularly advantageous for finding the singleton attractors of networks with high average degree and fewer inhibitory interactions. Our algorithm may shed light on understanding the function and structure of strong-inhibition Boolean networks. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
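Illustrative note on entry 177: the abstract does not spell out the update rule or the authors' algorithm, so the following is only a brute-force sketch of what "singleton attractor" (a fixed point of the update) and "pre-image" (any state that maps onto a given state in one step) mean. The strong-inhibition rule used here is an assumption: any active inhibitor switches a node off, otherwise any active activator switches it on, otherwise the node keeps its value. The function names and the toy network are hypothetical, and exhaustive enumeration is not the method of the paper.

```python
from itertools import product


def step(state, activators, inhibitors):
    """One synchronous update under the ASSUMED strong-inhibition rule."""
    nxt = []
    for i, current in enumerate(state):
        if any(state[j] for j in inhibitors[i]):
            nxt.append(0)        # any active inhibitor switches the node off
        elif any(state[j] for j in activators[i]):
            nxt.append(1)        # otherwise an active activator switches it on
        else:
            nxt.append(current)  # no active input: keep the current value
    return tuple(nxt)


def singleton_attractors(n, activators, inhibitors):
    """Enumerate all 2**n states and keep those that map to themselves."""
    return [s for s in product((0, 1), repeat=n)
            if step(s, activators, inhibitors) == s]


def pre_images(target, activators, inhibitors):
    """All states whose one-step successor equals the target state."""
    target = tuple(target)
    return [s for s in product((0, 1), repeat=len(target))
            if step(s, activators, inhibitors) == target]


# Hypothetical toy network: two genes that activate each other.
toy_activators = {0: [1], 1: [0]}
toy_inhibitors = {0: [], 1: []}
```

Exhaustive enumeration scales as 2^n and is only practical for small networks; the abstract reports that the authors' algorithm instead runs in time proportional to the number of singleton attractors.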
178. Pinpointing genes underlying annual/perennial transitions with comparative genomics.
- Author
-
Heidel, Andrew J., Kiefer, Christiane, Coupland, George, and Rose, Laura E.
- Subjects
- *
MOLECULAR genetics , *DNA , *GENOMICS , *ARABIDOPSIS thaliana , *STRUCTURAL genomics - Abstract
Background: Transitions between perennial and annual life histories occur often in plant lineages, but the genes that control whether a plant is an annual or perennial are largely unknown. To identify genes that confer differences between annuals and perennials we compared the gene content of four pairs of sister lineages (Arabidopsis thaliana/Arabidopsis lyrata, Arabis montbretiana/Arabis alpina, Arabis verna/Aubrieta parviflora and Draba nemorosa/Draba hispanica) in the Brassicaceae in which each pair contains one annual and one perennial, plus one extra annual species (Capsella rubella). Results: After sorting all genes in all nine species into gene families, we identified five families in which well-annotated genes are present in the perennials A. lyrata and A. alpina, but are not present in any of the annual species. For the eleven genes in perennials in these families, an orthologous pseudogene or otherwise highly diverged gene was found in the syntenic region of the annual species in six cases. The five candidate families identified encode: a kinase, an oxidoreductase, a lactoylglutathione lyase, an F-box protein and a zinc finger protein. By comparing the active gene in the perennial to the pseudogene or heavily altered gene in the annual, dN and dS were calculated. The low dN/dS values in one kinase suggest that it became pseudogenized more recently, while the other kinase, F-box, oxidoreductase and zinc-finger became pseudogenized closer to the divergence between the annual-perennial pair. Conclusions: We identified five gene families that may be involved in the life history switch from perennial to annual. Considering the dN and dS data, whether syntenic pseudogenes were found, and the potential functions of the genes, the F-box family is considered the most promising candidate for future functional studies to determine if it affects life history. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
179. NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents.
- Author
-
Liu, Sophia S., Hockenberry, Adam J., Lancichinetti, Andrea, Jewett, Michael C., and Amaral, Luís A. N.
- Subjects
- *
AMINO acid biotechnology , *AMINO compounds , *LIGANDS (Biochemistry) , *ORGANIC acids , *GENOMICS - Abstract
The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we have developed into a Python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs, which will lead to a better understanding of biological processes as well as more effective engineering of biological systems. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
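Illustrative note on entry 179: the abstract names the maximum-entropy principle but gives no formulas, so the sketch below is not the NullSeq package and does not use its API. It assumes one standard reading of the idea: for a fixed amino acid sequence, the maximum-entropy distribution over synonymous codons with a prescribed expected GC content weights each codon by exp(lambda * GC count), and lambda is found by bisection. All function names are hypothetical, and the standard genetic code is assumed.

```python
import math
import random
from itertools import product

# Standard genetic code (assumed), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SYNONYMS = {}
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS.setdefault(_aa, []).append(_codon)


def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


def codon_probs(aa, lam):
    """Synonymous codons of aa with probabilities proportional to exp(lam * GC)."""
    codons = SYNONYMS[aa]
    weights = [math.exp(lam * gc_count(c)) for c in codons]
    total = sum(weights)
    return codons, [w / total for w in weights]


def expected_gc(protein, lam):
    """Expected GC fraction of a coding sequence sampled with tilt lam."""
    total = 0.0
    for aa in protein:
        codons, probs = codon_probs(aa, lam)
        total += sum(p * gc_count(c) for c, p in zip(codons, probs))
    return total / (3 * len(protein))


def solve_lambda(protein, target_gc, lo=-20.0, hi=20.0, iters=60):
    """Bisection on lam; expected_gc is non-decreasing in lam."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if expected_gc(protein, mid) < target_gc:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def random_coding_sequence(protein, target_gc, rng=random):
    """Sample one coding sequence that encodes protein with the target expected GC."""
    lam = solve_lambda(protein, target_gc)
    picked = []
    for aa in protein:
        codons, probs = codon_probs(aa, lam)
        picked.append(rng.choices(codons, weights=probs, k=1)[0])
    return "".join(picked)
```

The target GC content must lie within the range that the amino acid sequence allows; for example, a protein made only of methionine (ATG) can only ever have one GC base per codon. The constraint here is on the expected GC content, so individual samples fluctuate around the target.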
180. Structural basis for the recognition of spliceosomal SmN/B/B' proteins by the RBM5 OCRE domain in splicing regulation.
- Author
-
Mourão, André, Bonnal, Sophie, Soni, Komal, Warner, Lisa, Bordonné, Rémy, Valcárcel, Juan, and Sattler, Michael
- Subjects
- *
SPLICEOSOMES , *RNA splicing , *STRUCTURAL genomics , *APOPTOSIS , *BIOCHEMISTRY , *GENETIC regulation , *GENETICS - Abstract
The multi-domain splicing factor RBM5 regulates the balance between antagonistic isoforms of the apoptosis-control genes FAS/CD95, Caspase-2 and AID. An OCRE (OCtamer REpeat of aromatic residues) domain found in RBM5 is important for alternative splicing regulation and mediates interactions with components of the U4/U6.U5 tri-snRNP. We show that the RBM5 OCRE domain adopts a unique β-sheet fold. NMR and biochemical experiments demonstrate that the OCRE domain directly binds to the proline-rich C-terminal tail of the essential snRNP core proteins SmN/B/B'. The NMR structure of an OCRE-SmN peptide complex reveals a specific recognition of poly-proline helical motifs in SmN/B/B'. Mutation of conserved aromatic residues impairs binding to the Sm proteins in vitro and compromises RBM5-mediated alternative splicing regulation of FAS/CD95. Thus, RBM5 OCRE represents a poly-proline recognition domain that mediates critical interactions with the C-terminal tail of the spliceosomal SmN/B/B' proteins in FAS/CD95 alternative splicing regulation. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
181. NMR in structural genomics to increase structural coverage of the protein universe.
- Author
-
Serrano, Pedro, Dutta, Samit K., Proudfoot, Andrew, Mohanty, Biswaranjan, Susac, Lukas, Martin, Bryan, Geralt, Michael, Jaroszewski, Lukasz, Godzik, Adam, Elsliger, Marc, Wilson, Ian A., and Wüthrich, Kurt
- Subjects
- *
NUCLEAR magnetic resonance , *STRUCTURAL genomics , *PROTEIN structure , *X-ray crystallography , *BIOLOGICAL research , *HOMOLOGY (Biology) - Abstract
For more than a decade, the Joint Center for Structural Genomics (JCSG) worked toward increased three-dimensional structure coverage of the protein universe. This coordinated quest was one of the main goals of the four high-throughput (HT) structure determination centers of the Protein Structure Initiative (PSI). To achieve the goals of the PSI, the JCSG made use of the complementarity of structure determination by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy to increase and diversify the range of targets entering the HT structure determination pipeline. The overall strategy, for both techniques, was to determine atomic resolution structures for representatives of large protein families, as defined by the Pfam database, which had no structural coverage and could make significant contributions to biological and biomedical research. Furthermore, the experimental structures could be leveraged by homology modeling to further expand the structural coverage of the protein universe and increase biological insights. Here, we describe what could be achieved by this structural genomics approach, using as an illustration the contributions from 20 NMR structure determinations out of a total of 98 JCSG NMR structures, which were selected because they are the first three-dimensional structure representations of the respective Pfam protein families. The information from this small sample is representative for the overall results from crystal and NMR structure determination in the JCSG. There are five new folds, which were classified as domains of unknown functions (DUF), three of the proteins could be functionally annotated based on three-dimensional structure similarity with previously characterized proteins, and 12 proteins showed only limited similarity with previous deposits in the Protein Data Bank (PDB) and were classified as DUFs. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
182. Analysis of Gene Expression in an Inbred Line of Soft-Shell Clams (Mya arenaria) Displaying Growth Heterosis: Regulation of Structural Genes and the NOD2 Pathway.
- Author
-
Wilson, John J., Grendler, Janelle, Dunlap-Smith, Azaline, Beal, Brian F., and Page, Shallee T.
- Subjects
- *
MYA arenaria , *HETEROSIS , *GENETIC regulation , *GENE expression , *STRUCTURAL genomics - Published
- 2016
- Full Text
- View/download PDF
183. Accumulation of transposable elements in Hox gene clusters during adaptive radiation of Anolis lizards.
- Author
-
Feiner, Nathalie
- Subjects
- *
HOMEOBOX genes , *ANOLES , *TRANSPOSONS , *BIOACCUMULATION , *LIZARD morphology , *STRUCTURAL genomics , *ECOLOGICAL niche - Abstract
Transposable elements (TEs) are DNA sequences that can insert elsewhere in the genome and modify genome structure and gene regulation. The role of TEs in evolution is contentious. One hypothesis posits that TE activity generates genomic incompatibilities that can cause reproductive isolation between incipient species. This predicts that TEs will accumulate during speciation events. Here, I tested the prediction that extant lineages with a relatively high rate of speciation have a high number of TEs in their genomes. I sequenced and analysed the TE content of a marker genomic region (Hox clusters) in Anolis lizards, a classic case of an adaptive radiation. Unlike other vertebrates, including closely related lizards, Anolis lizards have high numbers of TEs in their Hox clusters, genomic regions that regulate development of the morphological adaptations that characterize habitat specialists in these lizards. Following a burst of TE activity in the lineage leading to extant Anolis, TEs have continued to accumulate during or after speciation events, resulting in a positive relationship between TE density and lineage speciation rate. These results are consistent with the prediction that TE activity contributes to adaptive radiation by promoting speciation. Although there was no evidence that TE density per se is associated with ecological morphology, the activity of TEs in Hox clusters could have been a rich source for phenotypic variation that may have facilitated the rapid parallel morphological adaptation to microhabitats seen in extant Anolis lizards. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
184. Selection of reliable reference genes for quantitative real-time PCR analysis in plum (Prunus salicina Lindl.) under different postharvest treatments.
- Author
-
You, Yaohua, Zhang, Lei, Li, Pengmin, Yang, Chengquan, and Ma, Fengwang
- Subjects
- *
GENE expression , *POLYMERASE chain reaction , *PRUNUS salicina , *REVERSE transcriptase polymerase chain reaction , *MOLECULAR biology , *STRUCTURAL genomics - Abstract
The reverse transcription quantitative real-time polymerase chain reaction (qRT-PCR) technique has become one of the most widely used and reliable methods in gene expression studies. Successful application of qRT-PCR requires the accurate quantification of relative transcript levels, which strongly depends on the expression stability of the reference genes used as internal controls for data normalization. Plums (Prunus salicina Lindl.) are among the most numerous and commercially important fruit trees. In order to ensure the reliability of gene expression analyses using qRT-PCR in plum molecular biology research, 14 candidate reference genes were selected, and their relative expression levels were further measured by qRT-PCR using samples of plum peels obtained via different postharvest processes. Three statistical algorithms, geNorm, NormFinder, and BestKeeper, were employed to assess the expression stability of each candidate gene. A comprehensive evaluation was generated by the overall analysis approach, RefFinder, to infer the final rankings. The results showed that CAC was the most stably expressed candidate reference gene across all experimental conditions. CAC and UNK under the room temperature treatment, CAC, ACT, and CLATH under the cold treatment, and CAC and ACT under all treatments were suitable for accurate gene expression quantification. In addition, relative gene expression patterns of the plant anthocyanin biosynthesis-related structural gene PsANS were evaluated using selected housekeeping genes as internal controls under two treatments to further confirm the usefulness of the selected reference genes. These results indicated that the selection of systematically validated reference genes for specific experimental conditions is necessary to avoid misinterpretation of qRT-PCR data and to obtain accurate and reliable gene expression results. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
185. A gene cluster for the biosynthesis of moenomycin family antibiotics in the genome of teicoplanin producer Actinoplanes teichomyceticus.
- Author
-
Horbal, Liliya, Ostash, Bohdan, Luzhetskyy, Andriy, Walker, Suzanne, Kalinowski, Jorn, and Fedorenko, Victor
- Subjects
- *
ACTINOPLANES , *BACTERIAL genomes , *TEICOPLANIN , *ANTIBIOTIC synthesis , *BIOSYNTHESIS , *STRUCTURAL genomics - Abstract
Moenomycins are phosphoglycolipid antibiotics notable for their extreme potency, unique mode of action, and proven record of use in animal nutrition without selection for resistant microflora. There is a keen interest in manipulation of structures of moenomycins in order to better understand their structure-activity relationships and to generate improved analogs. Only two almost identical moenomycin biosynthetic gene clusters are known, limiting our knowledge of the evolution of moenomycin pathways and our ability to genetically diversify them. Here, we report a novel gene cluster (tchm) that directs production of the phosphoglycolipid teichomycin in Actinoplanes teichomyceticus. Its overall genetic architecture is significantly different from that of the moenomycin biosynthesis (moe) gene clusters of Streptomyces ghanaensis and Streptomyces clavuligerus, featuring multiple gene rearrangements and two novel structural genes. Involvement of the tchm cluster in teichomycin biosynthesis was confirmed via heterologous co-expression of amidotransferase tchmH5 and moe genes. Our work sets the background for further engineering of moenomycins and for deeper inquiries into the evolution of this fascinating biosynthetic pathway. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
186. DockQ: A Quality Measure for Protein-Protein Docking Models.
- Author
-
Basu, Sankar and Wallner, Björn
- Subjects
- *
PROTEIN-protein interactions , *MOLECULAR docking , *PROTEIN structure , *PROTEIN domains , *MACHINE learning - Abstract
The state of the art for assessing the structural quality of docking models is currently based on three related yet independent quality measures: Fnat, LRMS, and iRMS, as proposed and standardized by CAPRI. These quality measures quantify different aspects of the quality of a particular docking model and need to be viewed together to reveal the true quality, e.g. a model with relatively poor LRMS (>10 Å) might still qualify as 'acceptable' with a decent Fnat (>0.50) and iRMS (<3.0 Å). This is also the reason why the so-called CAPRI criteria for assessing the quality of docking models are defined by applying various ad hoc cutoffs on these measures to classify a docking model into four classes: Incorrect, Acceptable, Medium, or High quality. This classification has been useful in CAPRI, but since models are grouped in only four bins it is also rather limiting, making it difficult to rank models, correlate with scoring functions or use it as a target function in machine learning algorithms. Here, we present DockQ, a continuous protein-protein docking model quality measure derived by combining Fnat, LRMS, and iRMS into a single score in the range [0, 1] that can be used to assess the quality of protein docking models. By using DockQ on CAPRI models it is possible to almost completely reproduce the original CAPRI classification into Incorrect, Acceptable, Medium and High quality. An average PPV of 94% at 90% recall demonstrates that there is no need to apply predefined ad hoc cutoffs to classify docking models. Since DockQ recapitulates the CAPRI classification almost perfectly, it can be viewed as a higher resolution version of the CAPRI classification, making it possible to estimate model quality in a more quantitative way using Z-scores or sum of top ranked models, which has been so valuable for the CASP community. 
The possibility to directly correlate a quality measure to a scoring function has been crucial for the development of scoring functions for protein structure prediction, and DockQ should be useful in a similar development in the protein docking field. DockQ is available at [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
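Illustrative note on entry 186: the abstract says that Fnat, LRMS, and iRMS are combined into one score in [0, 1] but does not give the formula. The sketch below assumes the commonly cited form, in which each RMS value is rescaled as 1 / (1 + (rms / d)^2) and the three terms are averaged. The scaling constants (8.5 Å for LRMS, 1.5 Å for iRMS) and the class boundaries (0.23, 0.49, 0.80) are assumptions to check against the paper, and the function names are hypothetical rather than the interface of the DockQ program.

```python
def scaled_rms(rms, d):
    """Map an RMS deviation in angstroms to (0, 1]; 1.0 means a perfect match."""
    return 1.0 / (1.0 + (rms / d) ** 2)


def dockq(fnat, lrms, irms, d_lrms=8.5, d_irms=1.5):
    """Average of Fnat and the two rescaled RMS terms (assumed constants)."""
    return (fnat + scaled_rms(lrms, d_lrms) + scaled_rms(irms, d_irms)) / 3.0


def capri_class(score):
    """Bin a continuous score into CAPRI-style classes (assumed boundaries)."""
    if score >= 0.80:
        return "High"
    if score >= 0.49:
        return "Medium"
    if score >= 0.23:
        return "Acceptable"
    return "Incorrect"


# The abstract's example: poor LRMS but decent Fnat and iRMS.
example_score = dockq(fnat=0.50, lrms=12.0, irms=3.0)
example_class = capri_class(example_score)
```

Under these assumed constants, the abstract's example of a model with LRMS above 10 Å, Fnat of 0.50 and iRMS of 3.0 Å falls in the Acceptable bin, which matches the qualitative statement in the abstract.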
187. Additive methods for genomic signatures.
- Author
-
Karamichalis, Rallis, Kari, Lila, Konstantinidis, Stavros, Kopecki, Steffen, and Solis-Reyes, Stephen
- Subjects
- *
COMPARATIVE genomics , *STRUCTURAL genomics , *DNA structure , *CHAOS synchronization , *NUCLEOTIDE sequence , *NUCLEOTIDE sequencing - Abstract
Background: Studies exploring the potential of Chaos Game Representations (CGR) of genomic sequences to act as "genomic signatures" (to be species- and genome-specific) showed that CGR patterns of nuclear and organellar DNA sequences of the same organism can be very different. While the hypothesis that CGRs of mitochondrial DNA sequences can act as genomic signatures was validated for a snapshot of all sequenced mitochondrial genomes available in the NCBI GenBank sequence database, to our knowledge no such extensive analysis of CGRs of nuclear DNA sequences exists to date. Results: We analyzed an extensive dataset, totalling 1.45 gigabase pairs, of nuclear/nucleoid genomic sequences (nDNA) from 42 different organisms, spanning all major kingdoms of life. Our computational experiments indicate that CGR signatures of nDNA of two different origins cannot always be differentiated, especially if they originate from closely-related species such as H. sapiens and P. troglodytes or E. coli and E. fergusonii. To address this issue, we propose the general concept of additive DNA signature of a set (collection) of DNA sequences. One particular instance, the composite DNA signature, combines information from nDNA fragments and organellar (mitochondrial, chloroplast, or plasmid) genomes. We demonstrate that, in this dataset, composite DNA signatures originating from two different organisms can be differentiated in all cases, including those where the use of CGR signatures of nDNA failed or was inconclusive. Another instance, the assembled DNA signature, combines information from many short DNA subfragments (e.g., 100 basepairs) of a given DNA fragment, to produce its signature. We show that an assembled DNA signature has the same distinguishing power as a conventionally computed CGR signature, while using shorter contiguous sequences and potentially less sequence information. 
Conclusions: Our results suggest that, while CGR signatures of nDNA cannot always play the role of genomic signatures, composite and assembled DNA signatures (separately or in combination) could potentially be used instead. Such additive signatures could be used, e.g., with raw unassembled next-generation sequencing (NGS) read data, when high-quality sequencing data is not available, or to complement information obtained by other methods of species identification or classification. [ABSTRACT FROM AUTHOR]
- Published
- 2016
188. Whole-Genome Sequencing Analysis of Serially Isolated Multi-Drug and Extensively Drug Resistant Mycobacterium tuberculosis from Thai Patients.
- Author
Faksri, Kiatichai, Tan, Jun Hao, Disratthakit, Areeya, Xia, Eryu, Prammananan, Therdsak, Suriyaphol, Prapat, Khor, Chiea Chuen, Teo, Yik-Ying, Ong, Rick Twee-Hee, and Chaiprasert, Angkana
- Subjects
*TUBERCULOSIS treatment, *DRUG resistance in microorganisms, *GENETIC markers, *PUBLIC health, *NUCLEOTIDE sequencing, *MYCOBACTERIUM tuberculosis
- Abstract
Multi-drug and extensively drug-resistant tuberculosis (MDR and XDR-TB) are problems that threaten public health worldwide. Only some genetic markers associated with drug-resistant TB are known. Whole-genome sequencing (WGS) is a promising tool for distinguishing between re-infection and persistent infection in isolates taken at different times from a single patient, but has not yet been applied in MDR and XDR-TB. We aim to detect genetic markers associated with drug resistance and distinguish between reinfection and persistent infection from MDR and XDR-TB patients based on WGS analysis. Samples of Mycobacterium tuberculosis (n = 7), serially isolated from 2 MDR cases and 1 XDR-TB case, were retrieved from Siriraj Hospital, Bangkok. The WGS analysis used an Illumina MiSeq sequencer. In cases of persistent infection, MDR-TB isolates differed at an average of 2 SNPs across the span of 2–9 months whereas in the case of reinfection, isolates differed at 61 SNPs across 2 years. Known genetic markers associated with resistance were detected from strains susceptible to streptomycin (2/7 isolates), p-aminosalicylic acid (3/7 isolates) and fluoroquinolone drugs. Among fluoroquinolone drugs, ofloxacin had the highest phenotype-genotype concordance (6/7 isolates), whereas gatifloxacin had the lowest (3/7 isolates). A putative candidate SNP in Rv2477c associated with kanamycin and amikacin resistance was suggested for further validation. WGS provided comprehensive results regarding molecular epidemiology, distinguishing between persistent infection and reinfection in M/XDR-TB and potentially can be used for detection of novel mutations associated with drug resistance. [ABSTRACT FROM AUTHOR]
- Published
- 2016
189. The Genomic Scrapheap Challenge; Extracting Relevant Data from Unmapped Whole Genome Sequencing Reads, Including Strain Specific Genomic Segments, in Rats.
- Author
van der Weide, Robin H., Simonis, Marieke, Hermsen, Roel, Toonen, Pim, Cuppen, Edwin, and de Ligt, Joep
- Subjects
*GENOMICS, *NUCLEOTIDE sequencing, *PHYLOGENY, *GENE libraries, *GENOTYPE-environment interaction, *LABORATORY rats
- Abstract
Unmapped next-generation sequencing reads are typically ignored while they contain biologically relevant information. We systematically analyzed unmapped reads from whole genome sequencing of 33 inbred rat strains. High quality reads were selected and enriched for biologically relevant sequences; similarity-based analysis revealed clustering similar to previously reported phylogenetic trees. Our results demonstrate that on average 20% of all unmapped reads harbor sequences that can be used to improve reference genomes and generate hypotheses on potential genotype-phenotype relationships. Analysis pipelines would benefit from incorporating the described methods and reference genomes would benefit from inclusion of the genomic segments obtained through these efforts. [ABSTRACT FROM AUTHOR]
- Published
- 2016
190. Molecular Characterization of Ethylene-Regulated Anthocyanin Biosynthesis in Plums During Fruit Ripening.
- Author
Cheng, Yudou, Liu, Liqin, Yuan, Can, and Guan, Junfeng
- Subjects
*ANTHOCYANIN genetics, *BIOSYNTHESIS, *FRUIT ripening, *GENE expression, *ETHYLENE, *1-Methylcyclopropene, *STRUCTURAL genomics, *POSTHARVEST technology of crops
- Abstract
Anthocyanin accumulation is an important physiological process that occurs during plum fruit ripening. Currently, little is known about the molecular mechanisms of ethylene-regulated anthocyanin accumulation in plum fruit. To better understand this process, ethylene production, anthocyanin content, and the expression of genes involved in anthocyanin biosynthesis and ethylene signaling were studied in the postharvest 'Oishi-wase' plum (Prunus salicina Lindl. cv. 'Oishi-wase') fruit. Ethylene treatment significantly enhanced the anthocyanin accumulation in plum fruit peel, while 1-methylcyclopropene (1-MCP) treatment resulted in considerable reduction in anthocyanin content. Furthermore, ethylene treatment significantly enhanced the expression levels of the seven structural genes, i.e., PsPAL, PsCHS, PsCHI, PsF3H, PsDFR, PsLDOX, and PsUFGT, that were involved in the anthocyanin biosynthetic pathway, while 1-MCP treatment showed an opposite effect. Similar to the structural genes, the master transcription factor PsMYB10 messenger RNA (mRNA) was also induced by ethylene and suppressed by 1-MCP, and this was positively correlated with ethylene production, anthocyanin accumulation, and the expression profile of the structural genes. The expression patterns of the ethylene signal pathway-associated genes, including two ethylene receptors (PsERS1 and PsETR1) and seven ethylene-responsive factors (PsERFs), were also positively correlated with that of PsMYB10 and most of the structural genes involved in the anthocyanin biosynthetic pathway. Further analysis indicated that PsERS1, PsETR1, PsERF1a, PsERF1b, PsERF2a, PsERF3a, and PsERF3b might be involved in the anthocyanin biosynthetic pathway of plum fruit. These results suggest that the ethylene signaling pathway plays an important role in regulating anthocyanin biosynthesis of postharvest plum fruit. [ABSTRACT FROM AUTHOR]
- Published
- 2016
191. The structural molecular biology network of the State of São Paulo, Brazil
- Author
João A.R.G. Barbosa, Luis E.S. Netto, Chuck S. Farah, Sergio Schenkman, and Rogério Meneghini
- Subjects
structural genomics, protein crystallography, nuclear magnetic resonance, protein structure, Science
- Abstract
This article describes the achievements of the Structural Molecular Biology Network (SMolBNet), a collaborative program of structural molecular biology, centered in the State of São Paulo, Brazil, and supported by the São Paulo State Funding Agency (FAPESP). It gathers twenty scientific groups and is coordinated by the scientific staff of the Center of Structural Molecular Biology, at the National Laboratory of Synchrotron Light (LNLS), in Campinas. The SMolBNet program has been aimed at 1) solving the structure of proteins of interest related to the research projects of the groups (in some cases, the choice has been to select proteins of unknown function or of possible novel structure obtained from the sequenced genomes of the FAPESP genomic program); and 2) providing the groups with training in all the steps of protein structure determination: gene cloning, protein expression, protein purification, protein crystallization and structure determination. Having begun in 2001, the program has been successful in both aims. Here, four groups reveal their participation in the program and describe the structural aspects of the proteins they have selected to study.
- Published
- 2006
192. Genome-Wide Prediction and Analysis of 3D-Domain Swapped Proteins in the Human Genome from Sequence Information.
- Author
Upadhyay, Atul Kumar and Sowdhamini, Ramanathan
- Subjects
*PROTEINS, *GENOMES, *NUCLEOTIDE sequence, *AMYLOIDOSIS, *GENE ontology
- Abstract
3D-domain swapping is one of the mechanisms of protein oligomerization, and the proteins exhibiting this phenomenon have many biological functions. These proteins, which undergo domain swapping, have attracted much attention owing to their involvement in human diseases, such as conformational diseases, amyloidosis, serpinopathies, and proteinopathies. Early identification of proteins in the whole human genome that retain a tendency to domain swap will enable many aspects of disease control management. Predictive models were developed by using machine learning approaches with an average accuracy of 78% (85.6% of sensitivity, 87.5% of specificity and an MCC value of 0.72) to predict putative domain swapping in protein sequences. These models were applied to many complete genomes with special emphasis on the human genome. Nearly 44% of the protein sequences in the human genome were predicted positive for domain swapping. Enrichment analysis was performed on the positively predicted sequences from the human genome for their domain distribution, disease association and functional importance based on Gene Ontology (GO). Enrichment analysis was also performed to infer a better understanding of the functional importance of these sequences. Finally, we developed hinge region prediction, in the given putative domain swapped sequence, by using important physicochemical properties of amino acids. [ABSTRACT FROM AUTHOR]
- Published
- 2016
193. GNG Motifs Can Replace a GGG Stretch during G-Quadruplex Formation in a Context Dependent Manner.
- Author
Das, Kohal, Srivastava, Mrinal, and Raghavan, Sathees C.
- Subjects
*QUADRUPLEX nucleic acids, *DNA structure, *GUANINE, *MUTAGENESIS, *INTRAMOLECULAR catalysis
- Abstract
G-quadruplexes are one of the most commonly studied non-B DNA structures. Generally, these structures are formed using a minimum of four tracts of three guanines each, with connecting loops ranging from one to seven nucleotides. Recent studies have reported deviation from this general convention. One such deviation is the involvement of bulges in the guanine tracts. In this study, guanines along with bulges, also referred to as GNG motifs, have been extensively studied using the recently reported HOX11 breakpoint fragile region I as a model template. By a strategic mutagenesis approach we show that the contribution from continuous G-tracts may be dispensable during G-quadruplex formation when such motifs are flanked by GNGs. Importantly, the positioning and number of GNG/GNGNG can also influence the formation of G-quadruplexes. Further, we assessed three genomic regions from the HIF1 alpha, VEGF and SHOX genes for G-quadruplex formation using GNG motifs. We show that the HIF1 alpha sequence harbouring GNG motifs can fold into an intramolecular G-quadruplex. In contrast, GNG motifs in the mutant VEGF sequence could not participate in structure formation, suggesting that the usage of GNG is context dependent. Importantly, we show that when two continuous stretches of guanines are flanked by two independent GNG motifs in a naturally occurring sequence (SHOX), it can fold into an intramolecular G-quadruplex. Finally, we show the specific binding of the G-quadruplex binding protein Nucleolin and the G-quadruplex antibody BG4 to the SHOX G-quadruplex. Overall, our study provides novel insights into the role of GNG motifs in G-quadruplex structure formation which may have both physiological and pathological implications. [ABSTRACT FROM AUTHOR]
- Published
- 2016
194. A Cytolethal Distending Toxin Variant from Aggregatibacter actinomycetemcomitans with an Aberrant CdtB That Lacks the Conserved Catalytic Histidine 160.
- Author
Obradović, Davor, Gašperšič, Rok, Caserman, Simon, Leonardi, Adrijana, Jamnik, Maja, Podlesek, Zdravko, Seme, Katja, Anderluh, Gregor, Križaj, Igor, Maček, Peter, and Butala, Matej
- Subjects
*ACTINOBACILLUS actinomycetemcomitans, *HISTIDINE, *EUKARYOTIC cells, *NUCLEAR DNA, *PERIODONTITIS
- Abstract
The periodontopathogen Aggregatibacter actinomycetemcomitans synthesizes several virulence factors, including cytolethal distending toxin (CDT). The active CDT holoenzyme is an AB-type tripartite genotoxin that affects eukaryotic cells. Subunits CdtA and CdtC (B-components) allow binding and intracellular translocation of the active CdtB (A-component), which elicits nuclear DNA damage. Different strains of A. actinomycetemcomitans have diverse virulence genotypes, which results in varied pathogenic potential and disease progression. Here, we identified an A. actinomycetemcomitans strain isolated from two patients with advanced chronic periodontitis that has a regular cdtABC operon, which, however, codes for a unique, shorter variant of the CdtB subunit. We describe the characteristics of this CdtBΔ116–188, which lacks the intact nuclear localisation signal and the catalytic histidine 160. We show that the A. actinomycetemcomitans DO15 isolate secretes CdtBΔ116–188, and that this subunit cannot form a holotoxin and is also not genotoxic if expressed ectopically in HeLa cells. Furthermore, the A. actinomycetemcomitans DO15 isolate is not toxic, nor does it induce cellular distention upon infection of co-cultivated HeLa cells. The biological significance of this deletion in cdtB remains to be explained. [ABSTRACT FROM AUTHOR]
- Published
- 2016
195. LHP1 Regulates H3K27me3 Spreading and Shapes the Three-Dimensional Conformation of the Arabidopsis Genome.
- Author
Veluchamy, Alaguraj, Jégu, Teddy, Ariel, Federico, Latrasse, David, Mariappan, Kiruthiga Gayathri, Kim, Soon-Kap, Crespi, Martin, Hirt, Heribert, Bergounioux, Catherine, Raynaud, Cécile, and Benhamed, Moussa
- Subjects
*HISTONE methylation, *CHROMATIN, *ARABIDOPSIS thaliana genetics, *GENETIC regulation in plants, *MOLECULAR conformation, *NON-coding RNA
- Abstract
Precise expression patterns of genes in time and space are essential for proper development of multicellular organisms. Dynamic chromatin conformation and spatial organization of the genome constitute a major step in this regulation to modulate developmental outputs. Polycomb repressive complexes (PRCs) mediate stable or flexible gene repression in response to internal and environmental cues. In Arabidopsis thaliana, LHP1 co-localizes with H3K27me3 epigenetic marks throughout the genome and interacts with PRC1 and PRC2 members as well as with a long noncoding RNA. Here, we show that LHP1 is responsible for the spreading of H3K27me3 towards the 3’ end of the gene body. We also identified a subset of LHP1-activated genes and demonstrated that LHP1 shapes local chromatin topology in order to control transcriptional co-regulation. Our work reveals a general role of LHP1 from local to higher conformation levels of chromatin configuration to determine its accessibility to define gene expression patterns. [ABSTRACT FROM AUTHOR]
- Published
- 2016
196. Advances in computational approaches for prioritizing driver mutations and significantly mutated genes in cancer genomes.
- Author
Feixiong Cheng, Junfei Zhao, and Zhongming Zhao
- Subjects
*CANCER genes, *GENETIC mutation, *STRUCTURAL genomics, *GENE fusion, *NUCLEOTIDE sequencing, *CANCER genetics
- Abstract
Cancer is often driven by the accumulation of genetic alterations, including single nucleotide variants, small insertions or deletions, gene fusions, copy-number variations, and large chromosomal rearrangements. Recent advances in next-generation sequencing technologies have helped investigators generate massive amounts of cancer genomic data and catalog somatic mutations in both common and rare cancer types. So far, the somatic mutation landscapes and signatures of >10 major cancer types have been reported; however, pinpointing driver mutations and cancer genes from millions of available cancer somatic mutations remains a monumental challenge. To tackle this important task, many methods and computational tools have been developed during the past several years and, thus, a review of these advances is urgently needed. Here, we first summarize the main features of these methods and tools for whole-exome, whole-genome and whole-transcriptome sequencing data. Then, we discuss major challenges like tumor intra-heterogeneity, tumor sample saturation and functionality of synonymous mutations in cancer, all of which may result in false-positive discoveries. Finally, we highlight new directions in studying regulatory roles of noncoding somatic mutations and quantitatively measuring circulating tumor DNA in cancer. This review may help investigators find an appropriate tool for detecting potential driver or actionable mutations in rapidly emerging precision cancer medicine. [ABSTRACT FROM AUTHOR]
- Published
- 2016
197. Structural organization of fatty acid desaturase loci in linseed lines with contrasting linolenic acid contents.
- Author
Thambugala, Dinushika, Ragupathy, Raja, and Cloutier, Sylvie
- Subjects
*FATTY acid desaturase, *FLAXSEED, *LINOLENIC acids, *DIPLOIDY, *AGRONOMY, *BACTERIAL artificial chromosomes, *PLANT chromosomes
- Abstract
Flax (Linum usitatissimum L.), the richest crop source of omega-3 fatty acids (FAs), is a diploid plant with an estimated genome size of ~370 Mb and is well suited for studying genomic organization of agronomically important traits. In this study, 12 bacterial artificial chromosome clones harbouring the six FA desaturase loci sad1, sad2, fad2a, fad2b, fad3a and fad3b from the conventional variety CDC Bethune and the high linolenic acid line M5791 were sequenced, analysed and compared to determine the structural organization of these loci and to gain insights into the genetic mechanisms underlying FA composition in flax. With one gene every 3.2-4.6 kb, the desaturase loci have a higher gene density than the genome's average of one gene per 7.8-8.2 kb. The gene order and orientation across the two genotypes were generally conserved with the exception of the sad1 locus that was predicted to have additional genes in CDC Bethune. High sequence conservation in both genic and intergenic regions of the sad and fad2b loci contrasted with the significant level of variation of the fad2a and fad3 loci, with SNPs being the most frequently observed mutation type. The fad2a locus had 297 SNPs and 36 indels over ~95 kb contrasting with the fad2b locus that had a mere seven SNPs and four indels in ~110 kb. Annotation of the gene-rich loci revealed other genes of known role in lipid or carbohydrate metabolic/catabolic pathways. The organization of the fad2b locus was particularly complex with seven copies of the fad2b gene in both genotypes. The presence of Gypsy, Copia, MITE, Mutator, hAT and other novel repeat elements at the desaturase loci was similar to that of the whole genome. This structural genomic analysis provided some insights into the genomic organization and composition of the main desaturase loci of linseed and of their complex evolution through both tandem and whole genome duplications. [ABSTRACT FROM AUTHOR]
- Published
- 2016
198. Classification of proteins with shared motifs and internal repeats in the ECOD database.
- Author
Schaeffer, R. Dustin, Kinch, Lisa N., Liao, Yuxing, and Grishin, Nick V.
- Abstract
Proteins and their domains evolve by a set of events commonly including the duplication and divergence of small motifs. The presence of short repetitive regions in domains has generally constituted a difficult case for structural domain classifications and their hierarchies. We developed the Evolutionary Classification Of protein Domains (ECOD) in part to implement a new schema for the classification of these types of proteins. Here we document the ways in which ECOD classifies proteins with small internal repeats, widespread functional motifs, and assemblies of small domain-like fragments in its evolutionary schema. We illustrate the ways in which the structural genomics project impacted the classification and characterization of new structural domains and sequence families over the decade. [ABSTRACT FROM AUTHOR]
- Published
- 2016
199. QuIN: A Web Server for Querying and Visualizing Chromatin Interaction Networks.
- Author
Thibodeau, Asa, Márquez, Eladio J., Luo, Oscar, Ruan, Yijun, Menghi, Francesca, Shin, Dong-Guk, Stitzel, Michael L., Vera-Licona, Paola, and Ucar, Duygu
- Subjects
*CHROMATIN, *HUMAN genome, *HUMAN chromosomes, *QUERYING (Computer science), *GENOMICS
- Abstract
Recent studies of the human genome have indicated that regulatory elements (e.g. promoters and enhancers) at distal genomic locations can interact with each other via chromatin folding and affect gene expression levels. Genomic technologies for mapping interactions between DNA regions, e.g., ChIA-PET and HiC, can generate genome-wide maps of interactions between regulatory elements. These interaction datasets are important resources to infer distal gene targets of non-coding regulatory elements and to facilitate prioritization of critical loci for important cellular functions. With the increasing diversity and complexity of genomic information and public ontologies, making sense of these datasets demands integrative and easy-to-use software tools. Moreover, network representation of chromatin interaction maps enables effective data visualization, integration, and mining. Currently, there is no software that can take full advantage of network theory approaches for the analysis of chromatin interaction datasets. To fill this gap, we developed a web-based application, QuIN, which enables: 1) building and visualizing chromatin interaction networks, 2) annotating networks with user-provided private and publicly available functional genomics and interaction datasets, 3) querying network components based on gene name or chromosome location, and 4) utilizing network based measures to identify and prioritize critical regulatory targets and their direct and indirect interactions. AVAILABILITY: QuIN is provided as a web server. It is developed in Java and JavaScript, utilizing an Apache Tomcat web server and a MySQL database, and the source code is available on GitHub under the GPLv3 license. [ABSTRACT FROM AUTHOR]
- Published
- 2016
200. Mating structures for genomic selection breeding programs in aquaculture.
- Author
Sonesson, Anna K. and Ødegård, Jørgen
- Subjects
BREEDING, AQUACULTURE, STRUCTURAL genomics, ANIMAL sexual behavior, CHECK safekeeping
- Abstract
Background: In traditional family-based aquaculture breeding, each sire is mated to two dams in order to separate the sire's genetic effect from other family effects. Factorial mating involves more mates per sire and/or dam and results in more but smaller full- and/or half-sib families. For traits measured on sibs of selection candidates, factorial mating increases intensity of selection between families when selection is on traditional best linear unbiased prediction (BLUP) estimated breeding values (TRAD-EBV). However, selection on genome-wide estimated breeding values (GW-EBV) uses both within- and between-family effects, and the advantage of factorial mating is less obvious. Our aim was to compare by computer simulation the impact of various factorial mating strategies for truncation selection on TRAD-EBV versus GW-EBV on rates of inbreeding, accuracy of selection and genetic gain for two traits, i.e. one measured on selection candidates (CAND-TRAIT) and one on their sibs (SIB-TRAIT). Results: Sire:dam mating ratios of 1:1, 2:2 or 10:10 were tested with 100, 200 or 1000 families produced from a constant number of parents (100 sires and 100 dams), and a mating ratio of 1:2 with 200 families produced from 100 sires and 200 dams. With GW-EBV, changing the mating ratio from 1:1 to 10:10 had a limited effect on genetic gain (less than 5 %) for both CAND-TRAIT and SIB-TRAIT, whereas with TRAD-EBV, selection intensity increased for SIB-TRAIT and genetic gain increased by 41 and 77 % for schemes with 3000 and 12,000 selection candidates, respectively. For both GW-EBV and TRAD-EBV, rates of inbreeding decreased by up to ~30 % when the mating ratio was changed from 1:1 to 10:10 for schemes with 3000 to 12,000 selection candidates. Similar results were found for alternative heritabilities of SIB-TRAIT and total number of tested sibs.
Conclusions: Changing the sire:dam mating ratio from 1:1 to 10:10 increased genetic gain substantially with TRAD-EBV, mainly through increased selection intensity for the SIB-TRAIT, whereas with GW-EBV, it had a limited effect on genetic gain for both traits. Rates of inbreeding decreased for both selection methods. [ABSTRACT FROM AUTHOR]
- Published
- 2016