167 results on '"Mathews DH"'
Search Results
2. memerna: Sparse RNA Folding Including Coaxial Stacking.
- Author
-
Courtney E, Datta A, Mathews DH, and Ward M
- Abstract
Determining RNA secondary structure is a core problem in computational biology. Fast algorithms for predicting secondary structure are fundamental to this task. We describe a modified formulation of the Zuker-Stiegler algorithm with coaxial stacking, a stabilising interaction in which the ends of helices in multi-loops are stacked. In particular, optimal coaxial stacking is computed as part of the dynamic programming state, rather than in an inner loop. We introduce a new notion of sparsity, which we call replaceability. Replaceability is a more general condition and applicable in more places than the triangle inequality that is used by previous sparse folding methods. We also introduce non-monotonic candidate lists as an additional sparsification tool. Existing usages of the triangle inequality for sparsification can be thought of as an application of both replaceability and monotonicity together. The modified recurrences along with replaceability allow sparsification to be applied to coaxial stacking as well, which increases the speed of the algorithm. We implemented this algorithm in software we call memerna, which we show to have the fastest exact (non-heuristic) implementation of RNA folding under the complete Turner 2004 model with coaxial stacking, out of several popular RNA folding tools supporting coaxial stacking. We also introduce a new notation for secondary structure which includes coaxial stacking, terminal mismatches, and dangles (CTDs) information. The memerna package 0.1 release is available at https://github.com/Edgeworth/memerna/tree/release/0.1., Competing Interests: Declaration of Competing Interest The authors declared that there is no conflict of interest.
The author is an Editorial Board Member/Editor-in-Chief/Associate Editor/Guest Editor for Journal of Molecular Biology and was not involved in the editorial review or the decision to publish this article., (Copyright © 2024 The Author(s). Published by Elsevier Ltd.. All rights reserved.)
- Published
- 2024
3. Two riboswitch classes that share a common ligand-binding fold show major differences in the ability to accommodate mutations.
- Author
-
Srivastava Y, Akinyemi O, Rohe TC, Pritchett EM, Baker CD, Sharma A, Jenkins JL, Mathews DH, and Wedekind JE
- Abstract
Riboswitches are structured RNAs that sense small molecules to control expression. Prequeuosine1 (preQ1)-sensing riboswitches comprise three classes (I, II and III) that adopt distinct folds. Despite this difference, class II and III riboswitches each use 10 identical nucleotides to bind the preQ1 metabolite. Previous class II studies showed high sensitivity to binding-pocket mutations, which reduced preQ1 affinity and impaired function. Here, we introduced four equivalent mutations into a class III riboswitch, which maintained remarkably tight preQ1 binding. Co-crystal structures of each class III mutant showed compensatory interactions that preserve the fold. Chemical modification analysis revealed localized RNA flexibility changes for each mutant, but molecular dynamics (MD) simulations suggested that each mutation was not overtly destabilizing. Although impaired, class III mutants retained tangible gene-regulatory activity in bacteria compared to equivalent preQ1-II variants; mutations in the preQ1-pocket floor were tolerated better than wall mutations. Principal component analysis of MD trajectories suggested that the most functionally deleterious wall mutation samples different motions compared to wildtype. Overall, the results reveal that formation of compensatory interactions depends on the context of mutations within the overall fold and that functionally deleterious mutations can alter long-range correlated motions that link the riboswitch binding pocket with distal gene-regulatory sequences., (© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2024
4. DecoyFinder: Identification of Contaminants in Sets of Homologous RNA Sequences.
- Author
-
Zhu M, Zuber J, Tan Z, Sharma G, and Mathews DH
- Abstract
Motivation: RNA structure is essential for the function of many non-coding RNAs. Using multiple homologous sequences, which share structure and function, secondary structure can be predicted with much higher accuracy than with a single sequence. It can be difficult, however, to establish a set of homologous sequences when their structure is not yet known. We developed a method to identify sequences in a set of putative homologs that are in fact non-homologs., Results: Previously, we developed TurboFold to estimate conserved structure using multiple, unaligned RNA homologs. Here, we report that the positive predictive value of TurboFold is significantly reduced by the presence of contamination by non-homologous sequences, although the reduction is less than 1%. We developed a method called DecoyFinder, which applies machine learning trained with features determined by TurboFold, to detect sequences that are not homologous with the other sequences in the set. This method can identify approximately 45% of non-homologous sequences, at a rate of 5% misidentification of true homologous sequences., Availability: DecoyFinder and TurboFold are incorporated in RNAstructure, which is provided for free and open source under the GPL V2 license. It can be downloaded at http://rna.urmc.rochester.edu/RNAstructure.html.
- Published
- 2024
5. Comprehensive Profiling of Roquin Binding Preferences for RNA Stem-Loops.
- Author
-
Oberstrass L, Tants JN, Lichtenthaeler C, Ali SE, Koch L, Mathews DH, Schlundt A, and Weigand JE
- Abstract
The cellular levels of mRNAs are controlled post-transcriptionally by cis-regulatory elements located in the 3'-untranslated region. These linear or structured elements are recognized by RNA-binding proteins (RBPs) to modulate mRNA stability. The Roquin-1 and -2 proteins specifically recognize RNA stem-loop motifs, the trinucleotide loop-containing constitutive decay elements (CDEs) and the hexanucleotide loop-containing alternative decay elements (ADEs), with their unique ROQ domain to initiate mRNA degradation. However, the RNA-binding capacity of Roquin towards different classes of stem-loops has not been rigorously characterized, leaving its exact binding preferences unclear. Here, we map the RNA-binding preference of the ROQ domain at nucleotide resolution introducing sRBNS (structured RNA Bind-n-Seq), a customized RBNS workflow with pre-structured RNA libraries. We found a clear preference of Roquin towards specific loop sizes and extended the consensus motifs for CDEs and ADEs. The newly identified motifs are recognized with nanomolar affinity through the canonical RNA-ROQ interface. Using these new stem-loop variants as blueprints, we predicted novel Roquin target mRNAs and verified the expanded target space in cells. The study demonstrates the power of high-throughput assays including RNA structure formation for the systematic investigation of (structural) RNA-binding preferences to comprehensively identify mRNA targets and elucidate the biological function of RBPs., (© 2024 The Authors. Angewandte Chemie International Edition published by Wiley-VCH GmbH.)
- Published
- 2024
6. NNDB: An Expanded Database of Nearest Neighbor Parameters for Predicting Stability of Nucleic Acid Secondary Structures.
- Author
-
Mittal A, Turner DH, and Mathews DH
- Subjects
- Thermodynamics, Databases, Nucleic Acid, Computational Biology methods, Nucleic Acid Conformation, RNA chemistry, DNA chemistry
- Abstract
Nearest neighbor thermodynamic parameters are widely used for RNA and DNA secondary structure prediction and to model thermodynamic ensembles of secondary structures. The Nearest Neighbor Database (NNDB) is a freely available web resource (https://rna.urmc.rochester.edu/NNDB) that provides the functional forms, parameter values, and example calculations. The NNDB provides the 1999 and 2004 sets of RNA folding nearest neighbor parameters. We expanded the database to include a set of DNA parameters and a set of RNA parameters that includes m6A in addition to the canonical RNA nucleobases. The site was redesigned using the Quarto open-source publishing system. A downloadable PDF version of the complete resource and downloadable sets of nearest neighbor parameters are available., Competing Interests: Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (Copyright © 2024 Elsevier Ltd. All rights reserved.)
- Published
- 2024
7. Computational Resources for Molecular Biology 2024.
- Author
-
Casadio R, Mathews DH, and Sternberg MJE
- Subjects
- Humans, Computational Biology methods, Molecular Biology methods
- Published
- 2024
8. LinearAlifold: Linear-time consensus structure prediction for RNA alignments.
- Author
-
Malik A, Zhang L, Gautam M, Dai N, Li S, Zhang H, Mathews DH, and Huang L
- Subjects
- Genome, Viral, Software, Computational Biology methods, Humans, Sequence Analysis, RNA methods, Algorithms, RNA chemistry, SARS-CoV-2 genetics, RNA, Viral genetics, RNA, Viral chemistry, Nucleic Acid Conformation, COVID-19 virology, Sequence Alignment methods
- Abstract
Predicting the consensus structure of a set of aligned RNA homologs is a convenient method to find conserved structures in an RNA genome, which has many applications including viral diagnostics and therapeutics. However, the most commonly used tool for this task, RNAalifold, is prohibitively slow for long sequences, due to a cubic scaling with the sequence length, taking over a day on 400 SARS-CoV-2 and SARS-related genomes (∼30,000nt). We present LinearAlifold, a much faster alternative that scales linearly with both the sequence length and the number of sequences, based on our work LinearFold that folds a single RNA in linear time. Our work is orders of magnitude faster than RNAalifold (0.7 h on the above 400 genomes, or ∼36× speedup) and achieves higher accuracies when compared to a database of known structures. More interestingly, LinearAlifold's prediction on SARS-CoV-2 correlates well with experimentally determined structures, substantially outperforming RNAalifold. Finally, LinearAlifold supports two energy models (Vienna and BL*) and four modes: minimum free energy (MFE), maximum expected accuracy (MEA), ThreshKnot, and stochastic sampling, each of which takes under an hour for hundreds of SARS-CoV variants. Our resource is at: https://github.com/LinearFold/LinearAlifold (code) and http://linearfold.org/linear-alifold (server)., Competing Interests: Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (Copyright © 2024. Published by Elsevier Ltd.)
- Published
- 2024
9. Sequence Design Using RNAstructure.
- Author
-
Zhu M and Mathews DH
- Subjects
- RNA Folding, Sequence Analysis, RNA methods, Algorithms, Software, Nucleic Acid Conformation, RNA chemistry, RNA genetics, Computational Biology methods
- Abstract
RNA is present in all domains of life. It was once thought to be solely involved in protein expression, but recent advances have revealed its crucial role in catalysis and gene regulation through noncoding RNA. With a growing interest in exploring RNAs with specific structures, there is an increasing focus on designing RNA structures for in vivo and in vitro experimentation and for therapeutics. The development of RNA secondary structure prediction methods has also spurred the growth of RNA design software. However, there are challenges to designing RNA sequences that meet secondary structure requirements. One major challenge is that the secondary structure design problem is likely NP-hard, making it computationally intensive. Another issue is that objective functions need to consider the folding ensemble of RNA molecules to avoid off target structures. In this chapter, we provide protocols for two software tools from the RNAstructure package: "Design" for structured RNA sequence design and "orega" for unstructured RNA sequence design., (© 2025. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.)
- Published
- 2025
10. Estimating RNA Secondary Structure Folding Free Energy Changes with efn2.
- Author
-
Zuber J and Mathews DH
- Subjects
- Computational Biology methods, Models, Molecular, RNA chemistry, RNA Folding, Software, Nucleic Acid Conformation, Thermodynamics
- Abstract
A number of analyses require estimates of the folding free energy changes of specific RNA secondary structures. These predictions are often based on a set of nearest neighbor parameters that models the folding stability of a RNA secondary structure as the sum of folding stabilities of the structural elements that comprise the secondary structure. In the software suite RNAstructure, the free energy change calculation is implemented in the program efn2. The efn2 program estimates the folding free energy change and the experimental uncertainty in the folding free energy change. It can be run through the graphical user interface for RNAstructure, from the command line, or a web server. This chapter provides detailed protocols for using efn2., (© 2024. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.)
- Published
- 2024
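The additive model described in the efn2 abstract above (folding stability as the sum of the stabilities of the structural elements) can be sketched in a few lines of Python. This is only an illustration of the summation, not the efn2 implementation: the element names and free energy values are hypothetical placeholders and are not taken from any published nearest neighbor parameter set.

```python
# Hypothetical structural elements with illustrative free energy changes (kcal/mol).
# These numbers are placeholders, NOT real nearest neighbor parameters.
elements = [
    ("hairpin loop", 4.5),
    ("base pair stack", -3.3),
    ("base pair stack", -2.4),
    ("bulge loop", 3.8),
]


def total_folding_free_energy(elements):
    """Additive model: the total is the sum of the per-element contributions."""
    return sum(dg for _, dg in elements)


total = total_folding_free_energy(elements)
print(f"Estimated folding free energy change: {total:.1f} kcal/mol")
```

In a real calculation the per-element values would come from a parameter set such as those distributed with RNAstructure, and the experimental uncertainty would be estimated alongside the sum.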
11. LinearCoFold and LinearCoPartition: linear-time algorithms for secondary structure prediction of interacting RNA molecules.
- Author
-
Zhang H, Li S, Dai N, Zhang L, Mathews DH, and Huang L
- Subjects
- Humans, Base Pairing, Genomics, Nucleic Acid Conformation, RNA, Viral chemistry, SARS-CoV-2 chemistry, Algorithms, RNA chemistry, RNA metabolism
- Abstract
Many RNAs function through RNA-RNA interactions. Fast and reliable RNA structure prediction with consideration of RNA-RNA interaction is useful; however, existing tools are either too simplistic or too slow. To address this issue, we present LinearCoFold, which approximates the complete minimum free energy structure of two strands in linear time, and LinearCoPartition, which approximates the cofolding partition function and base pairing probabilities in linear time. LinearCoFold and LinearCoPartition are orders of magnitude faster than RNAcofold. For example, on a sequence pair with combined length of 26,190 nt, LinearCoFold is 86.8× faster than RNAcofold MFE mode, and LinearCoPartition is 642.3× faster than RNAcofold partition function mode. Surprisingly, LinearCoFold and LinearCoPartition's predictions have higher PPV and sensitivity of intermolecular base pairs. Furthermore, we apply LinearCoFold to predict the RNA-RNA interaction between SARS-CoV-2 genomic RNA (gRNA) and human U4 small nuclear RNA (snRNA), which has been experimentally studied, and observe that LinearCoFold's prediction correlates better with the wet lab results than RNAcofold's., (© The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2023
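The PPV and sensitivity figures mentioned in the abstract above are accuracy measures over base pairs. The sketch below assumes the common definitions (PPV = correct predicted pairs / all predicted pairs; sensitivity = correct predicted pairs / all known pairs) and exact matching of pairs. The abstract does not state its scoring rules, so the exact-match criterion and the example pairs are assumptions for illustration.

```python
def ppv_and_sensitivity(predicted_pairs, known_pairs):
    """Score predicted base pairs against known pairs using exact matches.

    Both arguments are sets of (i, j) index tuples with i < j.
    """
    true_positives = len(predicted_pairs & known_pairs)
    ppv = true_positives / len(predicted_pairs) if predicted_pairs else 0.0
    sensitivity = true_positives / len(known_pairs) if known_pairs else 0.0
    return ppv, sensitivity


# Hypothetical example: three known pairs, four predicted pairs, two in common.
known = {(1, 10), (2, 9), (3, 8)}
predicted = {(1, 10), (2, 9), (4, 7), (5, 6)}

ppv, sensitivity = ppv_and_sensitivity(predicted, known)
print(f"PPV = {ppv:.2f}, sensitivity = {sensitivity:.2f}")
```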
12. Secondary structures that regulate mRNA translation provide insights for ASO-mediated modulation of cardiac hypertrophy.
- Author
-
Hedaya OM, Venkata Subbaiah KC, Jiang F, Xie LH, Wu J, Khor ES, Zhu M, Mathews DH, Proschel C, and Yao P
- Subjects
- Humans, Animals, Mice, Codon, Initiator genetics, 5' Untranslated Regions, RNA, Messenger genetics, Open Reading Frames genetics, Protein Biosynthesis, Cardiomegaly genetics
- Abstract
Translation of upstream open reading frames (uORFs) typically abrogates translation of main (m)ORFs. The molecular mechanism of uORF regulation in cells is not well understood. Here, we data-mined human and mouse heart ribosome profiling analyses and identified a double-stranded RNA (dsRNA) structure within the GATA4 uORF that cooperates with the start codon to augment uORF translation and inhibits mORF translation. A trans-acting RNA helicase DDX3X inhibits the GATA4 uORF-dsRNA activity and modulates the translational balance of uORF and mORF. Antisense oligonucleotides (ASOs) that disrupt this dsRNA structure promote mORF translation, while ASOs that base-pair immediately downstream (i.e., forming a bimolecular double-stranded region) of either the uORF or mORF start codon enhance uORF or mORF translation, respectively. Human cardiomyocytes and mice treated with a uORF-enhancing ASO showed reduced cardiac GATA4 protein levels and increased resistance to cardiomyocyte hypertrophy. We further show the broad utility of uORF-dsRNA- or mORF-targeting ASO to regulate mORF translation for other mRNAs. This work demonstrates that the uORF-dsRNA element regulates the translation of multiple mRNAs as a generalizable translational control mechanism. Moreover, we develop a valuable strategy to alter protein expression and cellular phenotypes by targeting or generating dsRNA downstream of a uORF or mORF start codon., (© 2023. Springer Nature Limited.)
- Published
- 2023
13. DNA Structure Design Is Improved Using an Artificially Expanded Alphabet of Base Pairs Including Loop and Mismatch Thermodynamic Parameters.
- Author
-
Pham TM, Miffin T, Sun H, Sharp KK, Wang X, Zhu M, Hoshika S, Peterson RJ, Benner SA, Kahn JD, and Mathews DH
- Subjects
- Base Pairing, Thermodynamics, Nucleotides, Algorithms, DNA
- Abstract
We show that in silico design of DNA secondary structures is improved by extending the base pairing alphabet beyond A-T and G-C to include the pair between 2-amino-8-(1'-β-d-2'-deoxyribofuranosyl)-imidazo-[1,2-a]-1,3,5-triazin-(8H)-4-one and 6-amino-3-(1'-β-d-2'-deoxyribofuranosyl)-5-nitro-(1H)-pyridin-2-one, abbreviated as P and Z. To obtain the thermodynamic parameters needed to include P-Z pairs in the designs, we performed 47 optical melting experiments and combined the results with previous work to fit free energy and enthalpy nearest neighbor folding parameters for P-Z pairs and G-Z wobble pairs. We find G-Z pairs have stability comparable to that of A-T pairs and should therefore be included as base pairs in structure prediction and design algorithms. Additionally, we extrapolated the set of loop, terminal mismatch, and dangling end parameters to include the P and Z nucleotides. These parameters were incorporated into the RNAstructure software package for secondary structure prediction and analysis. Using the RNAstructure Design program, we solved 99 of the 100 design problems posed by Eterna using the ACGT alphabet or supplementing it with P-Z pairs. Extending the alphabet reduced the propensity of sequences to fold into off-target structures, as evaluated by the normalized ensemble defect (NED). The NED values were improved relative to those from the Eterna example solutions in 91 of 99 cases in which Eterna-player solutions were provided. P-Z-containing designs had average NED values of 0.040, significantly below the 0.074 of standard-DNA-only designs, and inclusion of the P-Z pairs decreased the time needed to converge on a design. This work provides a sample pipeline for inclusion of any expanded alphabet nucleotides into prediction and design workflows.
- Published
- 2023
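The normalized ensemble defect (NED) used in the abstract above can be illustrated with a short Python sketch. It assumes the usual definition from the design literature: the expected fraction of nucleotides whose pairing state differs from the target structure, computed from base pair probabilities. The function name, data layout, and example probabilities are hypothetical, and this is not the RNAstructure Design code.

```python
def normalized_ensemble_defect(n, target_partner, pair_prob):
    """Average per-nucleotide probability of NOT matching the target structure.

    n: sequence length
    target_partner: dict mapping each paired position to its partner (both directions)
    pair_prob: dict mapping (i, j) with i < j to the base pair probability
    """
    def p(i, j):
        return pair_prob.get((min(i, j), max(i, j)), 0.0)

    defect = 0.0
    for i in range(n):
        if i in target_partner:
            # Target says i pairs with a specific partner.
            defect += 1.0 - p(i, target_partner[i])
        else:
            # Target says i is unpaired; any pairing counts as a defect.
            defect += sum(p(i, j) for j in range(n) if j != i)
    return defect / n


# Hypothetical 4-nucleotide example: target has one pair between positions 0 and 3.
target = {0: 3, 3: 0}
probs = {(0, 3): 0.9, (1, 2): 0.1}

ned = normalized_ensemble_defect(4, target, probs)
print(f"NED = {ned:.3f}")
```

A lower NED means the ensemble concentrates on the target structure, which is the sense in which 0.040 improves on 0.074 in the abstract.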
14. Algorithm for optimized mRNA design improves stability and immunogenicity.
- Author
-
Zhang H, Zhang L, Lin A, Xu C, Li Z, Liu K, Liu B, Ma X, Zhao F, Jiang H, Chen C, Shen H, Li H, Mathews DH, Zhang Y, and Huang L
- Subjects
- Animals, Humans, Mice, Codon genetics, Half-Life, Herpesvirus 3, Human genetics, Herpesvirus 3, Human immunology, Algorithms, COVID-19 genetics, COVID-19 immunology, COVID-19 prevention & control, COVID-19 Vaccines chemistry, COVID-19 Vaccines genetics, COVID-19 Vaccines immunology, mRNA Vaccines chemistry, mRNA Vaccines genetics, mRNA Vaccines immunology, RNA Stability genetics, RNA Stability immunology, RNA, Messenger chemistry, RNA, Messenger genetics, RNA, Messenger immunology, RNA, Messenger metabolism, SARS-CoV-2 genetics, SARS-CoV-2 immunology
- Abstract
Messenger RNA (mRNA) vaccines are being used to combat the spread of COVID-19 (refs. 1-3), but they still exhibit critical limitations caused by mRNA instability and degradation, which are major obstacles for the storage, distribution and efficacy of the vaccine products (ref. 4). Increasing secondary structure lengthens mRNA half-life, which, together with optimal codons, improves protein expression (ref. 5). Therefore, a principled mRNA design algorithm must optimize both structural stability and codon usage. However, owing to synonymous codons, the mRNA design space is prohibitively large; for example, there are around 2.4 × 10^632 candidate mRNA sequences for the SARS-CoV-2 spike protein. This poses insurmountable computational challenges. Here we provide a simple and unexpected solution using the classical concept of lattice parsing in computational linguistics, where finding the optimal mRNA sequence is analogous to identifying the most likely sentence among similar-sounding alternatives (ref. 6). Our algorithm LinearDesign finds an optimal mRNA design for the spike protein in just 11 minutes, and can concurrently optimize stability and codon usage. LinearDesign substantially improves mRNA half-life and protein expression, and profoundly increases antibody titre by up to 128 times in mice compared to the codon-optimization benchmark on mRNA vaccines for COVID-19 and varicella-zoster virus. This result reveals the great potential of principled mRNA design and enables the exploration of previously unreachable but highly stable and efficient designs. Our work is a timely tool for vaccines and other mRNA-based medicines encoding therapeutic proteins such as monoclonal antibodies and anti-cancer drugs (refs. 7,8)., (© 2023. The Author(s).)
- Published
- 2023
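The size of the design space mentioned in the abstract above follows from synonymous codons: the number of mRNA sequences encoding a protein is the product of the codon counts of its amino acids. The sketch below uses the degeneracy of the standard genetic code; the function name and the short example peptide are hypothetical, and this is a counting illustration only, not the LinearDesign algorithm.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; stop codons are not included).
SYNONYMOUS_CODONS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_mrna_designs(protein):
    """Count the coding sequences that translate to the given protein."""
    return prod(SYNONYMOUS_CODONS[aa] for aa in protein)


# Hypothetical five-residue peptide: 1 * 2 * 4 * 2 * 6 candidate sequences.
peptide = "MFVFL"
n_designs = count_mrna_designs(peptide)
print(n_designs)
```

Because the count multiplies at every residue, it grows exponentially with protein length, which is why exhaustive enumeration is infeasible for a protein the size of the spike.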
15. Genome-Wide DNA Changes Acquired by Candida albicans Caspofungin-Adapted Mutants.
- Author
-
Zuber J, Sah SK, Mathews DH, and Rustchenko E
- Abstract
Drugs from the echinocandin (ECN) class are now recommended 'front-line' treatments of infections caused by a prevailing fungal pathogen, C. albicans. However, the increased use of ECNs is associated with a rising resistance to ECNs. As the acquisition of ECN resistance in C. albicans is viewed as a multistep evolution, determining factors that are associated with the decreased ECN susceptibility is of importance. We have recently identified two cohorts of genes that are either up- or downregulated in concert in order to control remodeling of cell wall, an organelle targeted by ECNs, in laboratory mutants with decreased ECN susceptibility. Here, we profiled the global DNA sequence of four of these adapted mutants in search of DNA changes that are associated with decreased ECN susceptibility. We find a limited number of 112 unique mutations representing two alternative mutational pathways. Approximately half of the mutations occurred as hotspots. Approximately half of mutations and hotspots were shared by ECN-adapted mutants despite the mutants arising as independent events and differing in some of their phenotypes, as well as in condition of chromosome 5. A total of 88 mutations are associated with 43 open reading frames (ORFs) and occurred inside of an ORF or within 1 kb of an ORF, predominantly as single-nucleotide substitution. Mutations occurred more often in the 5'-UTR than in the 3'-UTR by a 1.67:1 ratio. A total of 16 mutations mapped to eight genomic features that were not ORFs: Tca4-4 retrotransposon; Tca2-7 retrotransposon; lambda-4a long terminal repeat; mu-Ra long terminal repeat; MRS-7b Major Repeat Sequence; MRS-R Major Repeat Sequence; RB2-5a repeat sequence; and tL (CAA) leucine tRNA. Finally, eight mutations are not associated with any ORF or other genomic feature.
Repeated occurrence of single-nucleotide substitutions in non-related drug-adapted mutants strongly indicates that these DNA changes are accompanying drug adaptation and could possibly influence ECN susceptibility, thus serving as factors facilitating evolution of ECN drug resistance due to classical mutations in FKS1.
- Published
- 2023
16. Computational Resources for Molecular Biology 2023.
- Author
-
Mathews DH, Casadio R, and Sternberg MJE
- Subjects
- Molecular Biology, Computational Biology
- Published
- 2023
17. RNA Secondary Structure Analysis Using RNAstructure.
- Author
-
Ali SE, Mittal A, and Mathews DH
- Subjects
- Binding Sites, Probability, Protein Structure, Secondary, Oligonucleotides, RNA
- Abstract
RNAstructure is a user-friendly program for the prediction and analysis of RNA secondary structure. It is available as a web server, a program with a graphical user interface, or a set of command line tools. The programs are available for Microsoft Windows, macOS, or Linux. This article provides protocols for prediction of RNA secondary structure (using the web server, the graphical user interface, or the command line) and high-affinity oligonucleotide binding sites to a structured RNA target (using the graphical user interface). Basic Protocol 1: Predicting RNA secondary structure using the RNAstructure web server; Alternate Protocol 1: Predicting secondary structure and base pair probabilities using the RNAstructure graphical user interface; Alternate Protocol 2: Predicting secondary structure and base pair probabilities using the RNAstructure command line interface; Basic Protocol 2: Predicting binding affinities of oligonucleotides complementary to an RNA target using OligoWalk., (© 2023 Wiley Periodicals LLC.)
- Published
- 2023
18. RNA design via structure-aware multifrontier ensemble optimization.
- Author
-
Zhou T, Dai N, Li S, Ward M, Mathews DH, and Huang L
- Subjects
- Databases, Factual, Mutation, RNA, Ribosomal, 16S, Algorithms, Benchmarking
- Abstract
Motivation: RNA design is the search for a sequence or set of sequences that will fold to a desired structure, also known as the inverse problem of RNA folding. However, the sequences designed by existing algorithms often suffer from low ensemble stability, which worsens for long sequence design. Additionally, for many methods only a small number of sequences satisfying the MFE criterion can be found by each run of design. These drawbacks limit their use cases., Results: We propose an innovative optimization paradigm, SAMFEO, which optimizes ensemble objectives (equilibrium probability or ensemble defect) by iterative search and yields a very large number of successfully designed RNA sequences as byproducts. We develop a search method which leverages structure level and ensemble level information at different stages of the optimization: initialization, sampling, mutation, and updating. Our work, while being less complicated than others, is the first algorithm that is able to design thousands of RNA sequences for the puzzles from the Eterna100 benchmark. In addition, our algorithm solves the most Eterna100 puzzles among all the general optimization based methods in our study. The only baseline solving more puzzles than our work is dependent on handcrafted heuristics designed for a specific folding model. Surprisingly, our approach shows superiority on designing long sequences for structures adapted from the database of 16S Ribosomal RNAs., Availability and Implementation: Our source code and data used in this article are available at https://github.com/shanry/SAMFEO., (© The Author(s) 2023. Published by Oxford University Press.)
- Published
- 2023
19. Generation and Functional Analysis of Defective Viral Genomes during SARS-CoV-2 Infection.
- Author
-
Zhou T, Gilliam NJ, Li S, Spandau S, Osborn RM, Connor S, Anderson CS, Mariani TJ, Thakar J, Dewhurst S, Mathews DH, Huang L, and Sun Y
- Subjects
- Humans, RNA, Viral genetics, Cohort Studies, SARS-CoV-2 genetics, Genome, Viral, Antiviral Agents, COVID-19 genetics, RNA Viruses genetics
- Abstract
Defective viral genomes (DVGs) have been identified in many RNA viruses as a major factor influencing antiviral immune response and viral pathogenesis. However, the generation and function of DVGs in SARS-CoV-2 infection are less known. In this study, we elucidated DVG generation in SARS-CoV-2 and its relationship with host antiviral immune response. We observed DVGs ubiquitously from transcriptome sequencing (RNA-seq) data sets of in vitro infections and autopsy lung tissues of COVID-19 patients. Four genomic hot spots were identified for DVG recombination, and RNA secondary structures were suggested to mediate DVG formation. Functionally, bulk and single-cell RNA-seq analysis indicated the interferon (IFN) stimulation of SARS-CoV-2 DVGs. We further applied our criteria to the next-generation sequencing (NGS) data set from a published cohort study and observed a significantly higher amount and frequency of DVG in symptomatic patients than those in asymptomatic patients. Finally, we observed exceptionally diverse DVG populations in one immunosuppressive patient up to 140 days after the first positive test of COVID-19, suggesting for the first time an association between DVGs and persistent viral infections in SARS-CoV-2. Together, our findings strongly suggest a critical role of DVGs in modulating host IFN responses and symptom development, calling for further inquiry into the mechanisms of DVG generation and into how DVGs modulate host responses and infection outcome during SARS-CoV-2 infection. IMPORTANCE Defective viral genomes (DVGs) are generated ubiquitously in many RNA viruses, including SARS-CoV-2. Their interference activity to full-length viruses and IFN stimulation provide the potential for them to be used in novel antiviral therapies and vaccine development. 
SARS-CoV-2 DVGs are generated through the recombination of two discontinuous genomic fragments by viral polymerase complex, and this recombination is also one of the major mechanisms for the emergence of new coronaviruses. Focusing on the generation and function of SARS-CoV-2 DVGs, these studies identify new hot spots for nonhomologous recombination and strongly suggest that the secondary structures within viral genomes mediate the recombination. Furthermore, these studies provide the first evidence for IFN stimulation activity of de novo DVGs during natural SARS-CoV-2 infection. These findings set up the foundation for further mechanism studies of SARS-CoV-2 recombination and provide evidence to harness the immunostimulatory potential of DVGs in the development of a vaccine and antivirals for SARS-CoV-2., Competing Interests: The authors declare no conflict of interest.
- Published
- 2023
20. In vivo secondary structural analysis of Influenza A virus genomic RNA.
- Author
-
Mirska B, Woźniak T, Lorent D, Ruszkowska A, Peterson JM, Moss WN, Mathews DH, Kierzek R, and Kierzek E
- Subjects
- Humans, SARS-CoV-2 genetics, RNA, Viral genetics, Genomics, Influenza A Virus, H1N1 Subtype genetics, COVID-19, Influenza A virus genetics
- Abstract
Influenza A virus (IAV) is a respiratory virus that causes epidemics and pandemics. Knowledge of IAV RNA secondary structure in vivo is crucial for a better understanding of virus biology. Moreover, it is a fundament for the development of new RNA-targeting antivirals. Chemical RNA mapping using selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) coupled with Mutational Profiling (MaP) allows for the thorough examination of secondary structures in low-abundance RNAs in their biological context. So far, the method has been used for analyzing the RNA secondary structures of several viruses including SARS-CoV-2 in virio and in cellulo. Here, we used SHAPE-MaP and dimethyl sulfate mutational profiling with sequencing (DMS-MaPseq) for genome-wide secondary structure analysis of viral RNA (vRNA) of the pandemic influenza A/California/04/2009 (H1N1) strain in both in virio and in cellulo environments. Experimental data allowed the prediction of the secondary structures of all eight vRNA segments in virio and, for the first time, the structures of vRNA5, 7, and 8 in cellulo. We conducted a comprehensive structural analysis of the proposed vRNA structures to reveal the motifs predicted with the highest accuracy. We also performed a base-pairs conservation analysis of the predicted vRNA structures and revealed many highly conserved vRNA motifs among the IAVs. The structural motifs presented herein are potential candidates for new IAV antiviral strategies., (© 2023. The Author(s).)
- Published
- 2023
21. A riboswitch separated from its ribosome-binding site still regulates translation.
- Author
-
Schroeder GM, Akinyemi O, Malik J, Focht CM, Pritchett EM, Baker CD, McSally JP, Jenkins JL, Mathews DH, and Wedekind JE
- Subjects
- Binding Sites, Gene Expression Regulation, Molecular Dynamics Simulation, Nucleic Acid Conformation, Ribosomes genetics, Ribosomes metabolism, Riboswitch, Protein Biosynthesis
- Abstract
Riboswitches regulate downstream gene expression by binding cellular metabolites. Regulation of translation initiation by riboswitches is posited to occur by metabolite-mediated sequestration of the Shine-Dalgarno sequence (SDS), causing bypass by the ribosome. Recently, we solved a co-crystal structure of a prequeuosine1-sensing riboswitch from Carnobacterium antarcticum that binds two metabolites in a single pocket. The structure revealed that the second nucleotide within the gene-regulatory SDS, G34, engages in a crystal contact, obscuring the molecular basis of gene regulation. Here, we report a co-crystal structure wherein C10 pairs with G34. However, molecular dynamics simulations reveal quick dissolution of the pair, which fails to reform. Functional and chemical probing assays inside live bacterial cells corroborate the dispensability of the C10-G34 pair in gene regulation, leading to the hypothesis that the compact pseudoknot fold is sufficient for translation attenuation. Remarkably, the C. antarcticum aptamer retained significant gene-regulatory activity when uncoupled from the SDS using unstructured spacers up to 10 nucleotides away from the riboswitch, akin to the steric blocking employed by sRNAs. Accordingly, our work reveals that the RNA fold regulates translation without SDS sequestration, expanding known riboswitch-mediated gene-regulatory mechanisms. The results imply that riboswitches exist wherein the SDS is not embedded inside a stable fold., (© The Author(s) 2023. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2023
22. LazySampling and LinearSampling: fast stochastic sampling of RNA secondary structure with applications to SARS-CoV-2.
- Author
-
Zhang H, Li S, Zhang L, Mathews DH, and Huang L
- Subjects
- Humans, Base Sequence, RNA, Viral genetics, RNA, Viral chemistry, Nucleic Acid Conformation, COVID-19 diagnosis, COVID-19 genetics, SARS-CoV-2 genetics, Algorithms
- Abstract
Many RNAs fold into multiple structures at equilibrium, and there is a need to sample these structures according to their probabilities in the ensemble. The conventional sampling algorithm suffers from two limitations: (i) the sampling phase is slow due to many repeated calculations; and (ii) the end-to-end runtime scales cubically with the sequence length. These issues make it difficult to be applied to long RNAs, such as the full genomes of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). To address these problems, we devise a new sampling algorithm, LazySampling, which eliminates redundant work via on-demand caching. Based on LazySampling, we further derive LinearSampling, an end-to-end linear time sampling algorithm. Benchmarking on nine diverse RNA families, the sampled structures from LinearSampling correlate better with the well-established secondary structures than Vienna RNAsubopt and RNAplfold. More importantly, LinearSampling is orders of magnitude faster than standard tools, being 428× faster (72 s versus 8.6 h) than RNAsubopt on the full genome of SARS-CoV-2 (29 903 nt). The resulting sample landscape correlates well with the experimentally guided secondary structure models, and is closer to the alternative conformations revealed by experimentally driven analysis. Finally, LinearSampling finds 23 regions of 15 nt with high accessibilities in the SARS-CoV-2 genome, which are potential targets for COVID-19 diagnostics and therapeutics., (© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2023
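The abstract of entry 22 refers to sampling structures "according to their probabilities in the ensemble". As a minimal sketch of that idea only (it is not LazySampling or LinearSampling, which avoid enumerating structures), the Python below draws samples from a tiny, explicitly enumerated ensemble using Boltzmann weights. The three dot-bracket structures, their free energies, and the function names are hypothetical and chosen purely for illustration.

```python
import math
import random

# Gas constant in kcal/(mol*K) times 310.15 K (37 degrees C).
RT = 0.0019872 * 310.15


def boltzmann_probabilities(energies):
    """Convert folding free energies (kcal/mol) into ensemble probabilities."""
    weights = [math.exp(-g / RT) for g in energies]
    z = sum(weights)  # partition function over the enumerated structures
    return [w / z for w in weights]


def sample_structures(structures, energies, n, seed=0):
    """Draw n structures, each with probability proportional to exp(-G/RT)."""
    rng = random.Random(seed)
    probs = boltzmann_probabilities(energies)
    return rng.choices(structures, weights=probs, k=n)


# Hypothetical toy ensemble (not taken from any of the listed papers).
structures = ["((((....))))", "((......))..", "............"]
energies = [-3.0, -1.5, 0.0]

probs = boltzmann_probabilities(energies)
samples = sample_structures(structures, energies, 1000)
```

Explicit enumeration like this is only feasible for toy cases; practical tools sample from dynamic-programming tables instead, which is where the runtime concerns raised in the abstract arise.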
23. Isothermal Titration Calorimetry Analysis of a Cooperative Riboswitch Using an Interdependent-Sites Binding Model.
- Author
-
Cavender CE, Schroeder GM, Mathews DH, and Wedekind JE
- Subjects
- Calorimetry methods, Ligands, Protein Binding, Thermodynamics, Riboswitch
- Abstract
Isothermal titration calorimetry (ITC) is a powerful biophysical tool to characterize energetic profiles of biomacromolecular interactions without any alteration of the underlying chemical structures. In this protocol, we describe procedures for performing, analyzing, and interpreting ITC data obtained from a cooperative riboswitch-ligand interaction., (© 2023. This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply.)
- Published
- 2023
24. Linear-Time Algorithms for RNA Structure Prediction.
- Author
-
Zhang H, Zhang L, Liu K, Li S, Mathews DH, and Huang L
- Subjects
- Humans, Nucleic Acid Conformation, Base Pairing, Entropy, Computational Biology methods, Sequence Analysis, RNA methods, RNA chemistry, Algorithms
- Abstract
RNA secondary structure prediction is widely used to understand RNA function. Existing dynamic programming-based algorithms, both the classical minimum free energy (MFE) methods and partition function methods, suffer from a major limitation: their runtimes scale cubically with the RNA length, and this slowness limits their use in genome-wide applications. Inspired by incremental parsing for context-free grammars in computational linguistics, we designed linear-time heuristic algorithms, LinearFold and LinearPartition, to approximate the MFE structure, partition function and base pairing probabilities. These programs are orders of magnitude faster than Vienna RNAfold and CONTRAfold on long sequences. More interestingly, LinearFold and LinearPartition lead to more accurate predictions on the longest sequence families for which the structures are well established (16S and 23S Ribosomal RNAs), as well as improved accuracies for long-range base pairs (500+ nucleotides apart). This chapter provides protocols for using LinearFold and LinearPartition for secondary structure prediction., (© 2023. Springer Science+Business Media, LLC, part of Springer Nature.)
- Published
- 2023
25. Intrinsically Unstructured Sequences in the mRNA 3' UTR Reduce the Ability of Poly(A) Tail to Enhance Translation.
- Author
-
Lai WC, Zhu M, Belinite M, Ballard G, Mathews DH, and Ermolenko DN
- Subjects
- 5' Untranslated Regions, Eukaryotic Initiation Factor-4G chemistry, Poly(A)-Binding Proteins chemistry, RNA Caps chemistry, Cell-Free System, Triticum, Saccharomyces cerevisiae, Nucleic Acid Conformation, RNA Stability, 3' Untranslated Regions, Poly A chemistry, Protein Biosynthesis
- Abstract
The 5' cap and 3' poly(A) tail of mRNA are known to synergistically stimulate translation initiation via the formation of the cap•eIF4E•eIF4G•PABP•poly(A) complex. Most mRNA sequences have an intrinsic propensity to fold into extensive intramolecular secondary structures that result in short end-to-end distances. The inherent compactness of mRNAs might stabilize the cap•eIF4E•eIF4G•PABP•poly(A) complex and enhance cap-poly(A) translational synergy. Here, we test this hypothesis by introducing intrinsically unstructured sequences into the 5' or 3' UTRs of model mRNAs. We found that the introduction of unstructured sequences into the 3' UTR, but not the 5' UTR, decreases mRNA translation in cell-free wheat germ and yeast extracts without affecting mRNA stability. The observed reduction in protein synthesis results from the diminished ability of the poly(A) tail to stimulate translation. These results suggest that base pair formation by the 3' UTR enhances the cap-poly(A) synergy in translation initiation., Competing Interests: Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (Copyright © 2022 Elsevier Ltd. All rights reserved.)
- Published
- 2022
26. A Test and Refinement of Folding Free Energy Nearest Neighbor Parameters for RNA Including N6-Methyladenosine.
- Author
-
Szabat M, Prochota M, Kierzek R, Kierzek E, and Mathews DH
- Subjects
- Entropy, Adenosine analogs & derivatives, Adenosine chemistry, RNA chemistry, RNA Folding
- Abstract
RNA folding free energy change parameters are widely used to predict RNA secondary structure and to design RNA sequences. These parameters include terms for the folding free energies of helices and loops. Although the full set of parameters has only been traditionally available for the four common bases and backbone, it is well known that covalent modifications of nucleotides are widespread in natural RNAs. Covalent modifications are also widely used in engineered sequences. We recently derived a full set of nearest neighbor terms for RNA that includes N6-methyladenosine (m6A). In this work, we test the model using 98 optical melting experiments, matching duplexes with or without N6-methylation of A. Most experiments place RRACH, the consensus site of N6-methylation, in a variety of contexts, including helices, bulge loops, internal loops, dangling ends, and terminal mismatches. For matched sets of experiments that include either A or m6A in the same context, we find that the parameters for m6A are as accurate as those for A. Across all experiments, the root mean squared deviation between estimated and experimental free energy changes is 0.67 kcal/mol. We used the new experimental data to refine the set of nearest neighbor parameter terms for m6A. These parameters enable prediction of RNA secondary structures including m6A, which can be used to model how N6-methylation of A affects RNA structure., Competing Interests: Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (Copyright © 2022 The Authors. Published by Elsevier Ltd. All rights reserved.)
- Published
- 2022
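The abstract of entry 26 reports a root mean squared deviation of 0.67 kcal/mol without giving the formula. For reference, the standard definition is written out below. The symbols are assumptions introduced here: N is the number of experiments compared, and the two free energy terms are the estimated and experimentally measured folding free energy changes for experiment i.

```latex
\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}
  \left(\Delta G^{\circ}_{\mathrm{est},i} - \Delta G^{\circ}_{\mathrm{exp},i}\right)^{2}}
```

Because the differences are squared before averaging, a few large outliers raise this value more than they would raise a mean absolute difference.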
27. Generation and functional analysis of defective viral genomes during SARS-CoV-2 infection.
- Author
-
Zhou T, Gilliam NJ, Li S, Spaudau S, Osborn RM, Anderson CS, Mariani TJ, Thakar J, Dewhurst S, Mathews DH, Huang L, and Sun Y
- Abstract
Defective viral genomes (DVGs) have been identified in many RNA viruses as a major factor influencing antiviral immune response and viral pathogenesis. However, the generation and function of DVGs in SARS-CoV-2 infection are less well understood. In this study, we elucidated DVG generation in SARS-CoV-2 and its relationship with host antiviral immune response. We observed DVGs ubiquitously from RNA-seq datasets of in vitro infections and autopsy lung tissues of COVID-19 patients. Four genomic hotspots were identified for DVG recombination and RNA secondary structures were suggested to mediate DVG formation. Functionally, bulk and single cell RNA-seq analysis indicated the IFN stimulation of SARS-CoV-2 DVGs. We further applied our criteria to the NGS dataset from a published cohort study and observed significantly higher DVG amount and frequency in symptomatic patients than in asymptomatic patients. Finally, we observed unusually high DVG frequency in one immunosuppressive patient up to 140 days after admission to hospital due to COVID-19, suggesting for the first time an association between DVGs and persistent viral infections in SARS-CoV-2. Together, our findings strongly suggest a critical role of DVGs in modulating host IFN responses and symptom development, calling for further inquiry into the mechanisms of DVG generation and how DVGs modulate host responses and infection outcome during SARS-CoV-2 infection., Importance: Defective viral genomes (DVGs) are ubiquitously generated in many RNA viruses, including SARS-CoV-2. Their interference activity to full-length viruses and IFN stimulation provide them the potential for novel antiviral therapies and vaccine development. SARS-CoV-2 DVGs are generated through the recombination of two discontinuous genomic fragments by viral polymerase complex and the recombination is also one of the major mechanisms for the emergence of new coronaviruses. Focusing on the generation and function of SARS-CoV-2 DVGs, these studies identify new hotspots for non-homologous recombination and strongly suggest that the secondary structures within viral genomes mediate the recombination. Furthermore, these studies provide the first evidence for IFN stimulation activity of de novo DVGs during natural SARS-CoV-2 infection. These findings set up the foundation for further mechanism studies of SARS-CoV-2 recombination and provide evidence to harness the immunostimulatory potential of DVGs in the development of vaccines and antivirals for SARS-CoV-2.
- Published
- 2022
28. Deep learning models for RNA secondary structure prediction (probably) do not generalize across families.
- Author
-
Szikszai M, Wise M, Datta A, Ward M, and Mathews DH
- Subjects
- Humans, Neural Networks, Computer, Protein Structure, Secondary, Machine Learning, RNA, Deep Learning
- Abstract
Motivation: The secondary structure of RNA is of importance to its function. Over the last few years, several papers attempted to use machine learning to improve de novo RNA secondary structure prediction. Many of these papers report impressive results for intra-family predictions but seldom address the much more difficult (and practical) inter-family problem., Results: We demonstrate that it is nearly trivial with convolutional neural networks to generate pseudo-free energy changes, modelled after structure mapping data that improve the accuracy of structure prediction for intra-family cases. We propose a more rigorous method for inter-family cross-validation that can be used to assess the performance of learning-based models. Using this method, we further demonstrate that intra-family performance is insufficient proof of generalization despite the widespread assumption in the literature and provide strong evidence that many existing learning-based models have not generalized inter-family., Availability and Implementation: Source code and data are available at https://github.com/marcellszi/dl-rna., Supplementary Information: Supplementary data are available at Bioinformatics online., (© The Author(s) 2022. Published by Oxford University Press.)
- Published
- 2022
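The abstract of entry 28 contrasts intra-family with inter-family evaluation. The paper's own cross-validation method is not described in this listing, so the Python below is only a generic leave-one-family-out split, sketched to show the basic constraint that no RNA family appears in both the training and the test set. The record names, family labels, and function name are hypothetical.

```python
from collections import defaultdict


def leave_one_family_out(records):
    """Yield (held_out_family, train, test) folds with disjoint families.

    records is an iterable of (sequence_name, family_label) pairs.
    """
    by_family = defaultdict(list)
    for name, family in records:
        by_family[family].append(name)

    for held_out in sorted(by_family):
        test = list(by_family[held_out])
        # Every sequence from any other family goes into training.
        train = [
            name
            for family, names in by_family.items()
            if family != held_out
            for name in names
        ]
        yield held_out, train, test


# Hypothetical records used only to exercise the function.
records = [
    ("seq1", "tRNA"),
    ("seq2", "tRNA"),
    ("seq3", "5S rRNA"),
    ("seq4", "SRP"),
    ("seq5", "SRP"),
]
folds = list(leave_one_family_out(records))
```

A random split of the same records could place `seq1` in training and `seq2` in testing, which measures intra-family performance; the family-wise split above rules that out by construction.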
29. Quantitative prediction of variant effects on alternative splicing in MAPT using endogenous pre-messenger RNA structure probing.
- Author
-
Kumar J, Lackey L, Waldern JM, Dey A, Mustoe AM, Weeks KM, Mathews DH, and Laederach A
- Subjects
- Exons, Introns, Mutation, RNA Splice Sites, RNA Splicing, RNA, Messenger genetics, Alternative Splicing, RNA Precursors genetics, RNA Precursors metabolism
- Abstract
Splicing is highly regulated and is modulated by numerous factors. Quantitative predictions for how a mutation will affect precursor mRNA (pre-mRNA) structure and downstream function are particularly challenging. Here, we use a novel chemical probing strategy to visualize endogenous precursor and mature MAPT mRNA structures in cells. We used these data to estimate Boltzmann suboptimal structural ensembles, which were then analyzed to predict consequences of mutations on pre-mRNA structure. Further analysis of recent cryo-EM structures of the spliceosome at different stages of the splicing cycle revealed that the footprint of the B^act complex with pre-mRNA best predicted alternative splicing outcomes for exon 10 inclusion of the alternatively spliced MAPT gene, achieving 74% accuracy. We further developed a β-regression weighting framework that incorporates splice site strength, RNA structure, and exonic/intronic splicing regulatory elements capable of predicting, with 90% accuracy, the effects of 47 known and 6 newly discovered mutations on inclusion of exon 10 of MAPT. This combined experimental and computational framework represents a path forward for accurate prediction of splicing-related disease-causing variants., Competing Interests: JK, LL, JW, AD, AM, DM, AL No competing interests declared, KW is an advisor to and holds equity in Ribometrix, (© 2022, Kumar et al.)
- Published
- 2022
30. Nearest neighbor rules for RNA helix folding thermodynamics: improved end effects.
- Author
-
Zuber J, Schroeder SJ, Sun H, Turner DH, and Mathews DH
- Subjects
- Base Sequence, Nucleic Acid Conformation, Thermodynamics, RNA chemistry, RNA Folding
- Abstract
Nearest neighbor parameters for estimating the folding stability of RNA secondary structures are in widespread use. For helices, current parameters penalize terminal AU base pairs relative to terminal GC base pairs. We curated an expanded database of helix stabilities determined by optical melting experiments. Analysis of the updated database shows that terminal penalties depend on the sequence identity of the adjacent penultimate base pair. New nearest neighbor parameters that include this additional sequence dependence accurately predict the measured values of 271 helices in an updated database with a correlation coefficient of 0.982. This refined understanding of helix ends facilitates fitting terms for base pair stacks with GU pairs. Prior parameter sets treated 5'GGUC3' paired to 3'CUGG5' separately from other 5'GU3'/3'UG5' stacks. The improved understanding of helix end stability, however, makes the separate treatment unnecessary. Introduction of the additional terms was tested with three optical melting experiments. The average absolute difference between measured and predicted free energy changes at 37°C for these three duplexes containing terminal adjacent AU and GU pairs improved from 1.38 to 0.27 kcal/mol. This confirms the need for the additional sequence dependence in the model., (© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2022
31. Pre-mRNA splicing factor U2AF2 recognizes distinct conformations of nucleotide variants at the center of the pre-mRNA splice site signal.
- Author
-
Glasser E, Maji D, Biancon G, Puthenpeedikakkal AMK, Cavender CE, Tebaldi T, Jenkins JL, Mathews DH, Halene S, and Kielkopf CL
- Subjects
- Humans, RNA metabolism, RNA Splicing, Uridine metabolism, Nucleotides metabolism, RNA Precursors metabolism, Splicing Factor U2AF metabolism
- Abstract
The essential pre-mRNA splicing factor U2AF2 (also called U2AF65) identifies polypyrimidine (Py) tract signals of nascent transcripts, despite length and sequence variations. Previous studies have shown that the U2AF2 RNA recognition motifs (RRM1 and RRM2) preferentially bind uridine-rich RNAs. Nonetheless, the specificity of the RRM1/RRM2 interface for the central Py tract nucleotide has yet to be investigated. We addressed this question by determining crystal structures of U2AF2 bound to a cytidine, guanosine, or adenosine at the central position of the Py tract, and compared U2AF2-bound uridine structures. Local movements of the RNA site accommodated the different nucleotides, whereas the polypeptide backbone remained similar among the structures. Accordingly, molecular dynamics simulations revealed flexible conformations of the central, U2AF2-bound nucleotide. The RNA binding affinities and splicing efficiencies of structure-guided mutants demonstrated that U2AF2 tolerates nucleotide substitutions at the central position of the Py tract. Moreover, enhanced UV-crosslinking and immunoprecipitation of endogenous U2AF2 in human erythroleukemia cells showed uridine-sensitive binding sites, with lower sequence conservation at the central nucleotide positions of otherwise uridine-rich, U2AF2-bound splice sites. Altogether, these results highlight the importance of RNA flexibility for protein recognition and take a step towards relating splice site motifs to pre-mRNA splicing efficiencies., (© The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2022
32. Secondary structure prediction for RNA sequences including N6-methyladenosine.
- Author
-
Kierzek E, Zhang X, Watson RM, Kennedy SD, Szabat M, Kierzek R, and Mathews DH
- Subjects
- Adenosine analogs & derivatives, Base Sequence, Humans, Nucleic Acid Conformation, Thermodynamics, RNA chemistry, Software
- Abstract
There is increasing interest in the roles of covalently modified nucleotides in RNA. There has been, however, an inability to account for modifications in secondary structure prediction because of a lack of software and thermodynamic parameters. We report the solution for these issues for N6-methyladenosine (m6A), allowing secondary structure prediction for an alphabet of A, C, G, U, and m6A. The RNAstructure software now works with user-defined nucleotide alphabets of any size. We also report a set of nearest neighbor parameters for helices and loops containing m6A, using experiments. Interestingly, N6-methylation decreases folding stability for adenosines in the middle of a helix, has little effect on folding stability for adenosines at the ends of helices, and increases folding stability for unpaired adenosines stacked on a helix. We demonstrate predictions for an N6-methylation-activated protein recognition site from MALAT1 and human transcriptome-wide effects of N6-methylation on the probability of adenosine being buried in a helix., (© 2022. The Author(s).)
- Published
- 2022
33. Secondary Structure of Influenza A Virus Genomic Segment 8 RNA Folded in a Cellular Environment.
- Author
-
Szutkowska B, Wieczorek K, Kierzek R, Zmora P, Peterson JM, Moss WN, Mathews DH, and Kierzek E
- Subjects
- Animals, Base Sequence, Dogs, Humans, Influenza A Virus, H1N1 Subtype metabolism, Influenza, Human virology, Madin Darby Canine Kidney Cells, Models, Molecular, Nucleotide Motifs genetics, RNA Folding, RNA, Viral genetics, Viral Proteins genetics, Viral Proteins metabolism, Gene Expression Regulation, Viral, Genome, Viral genetics, Influenza A Virus, H1N1 Subtype genetics, Nucleic Acid Conformation, RNA, Viral chemistry
- Abstract
Influenza A virus (IAV) is a member of the single-stranded RNA (ssRNA) family of viruses. The most recent global pandemic caused by the SARS-CoV-2 virus has shown the major threat that RNA viruses can pose to humanity. In comparison, influenza has an even higher pandemic potential as a result of its high rate of mutations within its relatively short (<13 kbp) genome, as well as its capability to undergo genetic reassortment. In light of this threat, and the fact that RNA structure is connected to a broad range of known biological functions, deeper investigation of viral RNA (vRNA) structures is of high interest. Here, for the first time, we propose a secondary structure for segment 8 vRNA (vRNA8) of A/California/04/2009 (H1N1) formed in the presence of cellular and viral components. This structure shows similarities with prior in vitro experiments. Additionally, we determined the location of several well-defined, conserved structural motifs of vRNA8 within IAV strains with possible functionality. These RNA motifs appear to fold independently of regional nucleoprotein (NP)-binding affinity, but a low or uneven distribution of NP in each motif region is noted. This research also highlights several accessible sites for oligonucleotide tools and small molecules in vRNA8 in a cellular environment that might be a target for influenza A virus inhibition on the RNA level.
- Published
- 2022
34. Specific length and structure rather than high thermodynamic stability enable regulatory mRNA stem-loops to pause translation.
- Author
-
Bao C, Zhu M, Nykonchuk I, Wakabayashi H, Mathews DH, and Ermolenko DN
- Subjects
- Bacterial Proteins genetics, DNA Polymerase III genetics, Escherichia coli genetics, Fluorescence Resonance Energy Transfer, HIV genetics, Nucleic Acid Conformation, RNA, Bacterial chemistry, RNA, Bacterial metabolism, RNA, Messenger metabolism, RNA, Transfer metabolism, RNA, Viral chemistry, RNA, Viral metabolism, Single Molecule Imaging, Thermodynamics, Frameshifting, Ribosomal, RNA, Messenger chemistry
- Abstract
Translating ribosomes unwind mRNA secondary structures by three basepairs each elongation cycle. Despite the ribosome helicase, certain mRNA stem-loops stimulate programmed ribosomal frameshift by inhibiting translation elongation. Here, using mutagenesis, biochemical and single-molecule experiments, we examine whether high stability of three basepairs, which are unwound by the translating ribosome, is critical for inducing ribosome pauses. We find that encountering frameshift-inducing mRNA stem-loops from the E. coli dnaX mRNA and the gag-pol transcript of Human Immunodeficiency Virus (HIV) hinders A-site tRNA binding and slows down ribosome translocation by 15- to 20-fold. By contrast, unwinding of the first three basepairs adjacent to the mRNA entry channel slows down the translating ribosome by only 2- to 3-fold. Rather than high thermodynamic stability, specific length and structure enable regulatory mRNA stem-loops to stall translation by forming inhibitory interactions with the ribosome. Our data provide the basis for rationalizing transcriptome-wide studies of translation and searching for novel regulatory mRNA stem-loops., (© 2022. The Author(s).)
- Published
- 2022
35. A small RNA that cooperatively senses two stacked metabolites in one pocket for gene control.
- Author
-
Schroeder GM, Cavender CE, Blau ME, Jenkins JL, Mathews DH, and Wedekind JE
- Subjects
- Base Pairing, Cloning, Molecular, Crystallography, X-Ray, Escherichia coli genetics, Escherichia coli metabolism, Gene Expression, Gene Expression Regulation, Bacterial, Genetic Vectors chemistry, Genetic Vectors metabolism, Green Fluorescent Proteins genetics, Green Fluorescent Proteins metabolism, Neisseria gonorrhoeae metabolism, Nucleic Acid Conformation, Nucleoside Q biosynthesis, Pyrimidinones metabolism, Pyrroles metabolism, RNA, Bacterial genetics, RNA, Bacterial metabolism, RNA, Messenger genetics, RNA, Messenger metabolism, Recombinant Proteins chemistry, Recombinant Proteins genetics, Recombinant Proteins metabolism, Neisseria gonorrhoeae genetics, Nucleoside Q chemistry, Pyrimidinones chemistry, Pyrroles chemistry, RNA, Bacterial chemistry, RNA, Messenger chemistry, Riboswitch
- Abstract
Riboswitches are structured non-coding RNAs often located upstream of essential genes in bacterial messenger RNAs. Such RNAs regulate expression of downstream genes by recognizing a specific cellular effector. Although nearly 50 riboswitch classes are known, only a handful recognize multiple effectors. Here, we report the 2.60-Å resolution co-crystal structure of a class I type I preQ1-sensing riboswitch that reveals two effectors stacked atop one another in a single binding pocket. These effectors bind with positive cooperativity in vitro and both molecules are necessary for gene regulation in bacterial cells. Stacked effector recognition appears to be a hallmark of the largest subgroup of preQ1 riboswitches, including those from pathogens such as Neisseria gonorrhoeae. We postulate that binding to stacked effectors arose in the RNA World to closely position two substrates for RNA-mediated catalysis. These findings expand known effector recognition capabilities of riboswitches and have implications for antimicrobial development., (© 2022. The Author(s).)
- Published
- 2022
36. LinearTurboFold: Linear-time global prediction of conserved structures for RNA homologs with applications to SARS-CoV-2.
- Author
-
Li S, Zhang H, Zhang L, Liu K, Liu B, Mathews DH, and Huang L
- Subjects
- Betacoronavirus chemistry, Betacoronavirus genetics, Conserved Sequence, Genome, Viral, Mutation, Nucleic Acid Conformation, RNA Folding, RNA, Viral genetics, SARS-CoV-2 genetics, Sequence Alignment, Algorithms, RNA, Viral chemistry, SARS-CoV-2 chemistry
- Abstract
The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Therefore, it is critical to identify conserved structures in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms to predict these conserved structures, which simultaneously fold and align multiple RNA homologs, scale at best cubically with sequence length and are thus infeasible for coronaviruses, which possess the longest genomes (∼30,000 nt) among RNA viruses. As a result, existing efforts on modeling SARS-CoV-2 structures resort to single-sequence folding as well as local folding methods with short window sizes, which inevitably neglect long-range interactions that are crucial in RNA functions. Here we present LinearTurboFold, an efficient algorithm for folding RNA homologs that scales linearly with sequence length, enabling unprecedented global structural analysis on SARS-CoV-2. Surprisingly, on a group of SARS-CoV-2 and SARS-related genomes, LinearTurboFold's purely in silico prediction not only is close to experimentally guided models for local structures, but also goes far beyond them by capturing the end-to-end pairs between 5' and 3' untranslated regions (UTRs) (∼29,800 nt apart) that match perfectly with a purely experimental work. Furthermore, LinearTurboFold identifies undiscovered conserved structures and conserved accessible regions as potential targets for designing efficient and mutation-insensitive small-molecule drugs, antisense oligonucleotides, small interfering RNAs (siRNAs), CRISPR-Cas13 guide RNAs, and RT-PCR primers. LinearTurboFold is a general technique that can also be applied to other RNA viruses and full-length genome studies and will be a useful tool in fighting the current and future pandemics., Competing Interests: The authors declare no competing interest., (Copyright © 2021 the Author(s). Published by PNAS.)
- Published
- 2021
37. LazySampling and LinearSampling: Fast Stochastic Sampling of RNA Secondary Structure with Applications to SARS-CoV-2.
- Author
-
Zhang H, Zhang L, Li S, Mathews DH, and Huang L
- Abstract
Many RNAs fold into multiple structures at equilibrium. The classical stochastic sampling algorithm can sample secondary structures according to their probabilities in the Boltzmann ensemble, and is widely used. However, this algorithm, consisting of a bottom-up partition function phase followed by a top-down sampling phase, suffers from three limitations: (a) the formulation and implementation of the sampling phase are unnecessarily complicated; (b) the sampling phase repeatedly recalculates many redundant recursions already done during the partition function phase; (c) the partition function runtime scales cubically with the sequence length. These issues prevent stochastic sampling from being used for very long RNAs such as the full genomes of SARS-CoV-2. To address these problems, we first adopt a hypergraph framework under which the sampling algorithm can be greatly simplified. We then present three sampling algorithms under this framework, among which the LazySampling algorithm is the fastest by eliminating redundant work in the sampling phase via on-demand caching. Based on LazySampling, we further replace the cubic-time partition function by a linear-time approximate one, and derive LinearSampling, an end-to-end linear-time sampling algorithm that is orders of magnitude faster than the standard one. For instance, LinearSampling is 176× faster (38.9 s vs. 1.9 h) than Vienna RNAsubopt on the full genome of Ebola virus (18,959 nt). More importantly, LinearSampling is the first RNA structure sampling algorithm to scale up to the full genome of SARS-CoV-2 without local window constraints, taking only 69.2 seconds on its reference sequence (29,903 nt). The resulting sample correlates well with the experimentally guided structures. On the SARS-CoV-2 genome, LinearSampling finds 23 regions of 15 nt with high accessibilities, which are potential targets for COVID-19 diagnostics and drug design. See code: https://github.com/LinearFold/LinearSampling.
- Published
- 2021
- Full Text
- View/download PDF
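The two-phase scheme that the abstract of the preceding record attributes to classical stochastic sampling (a bottom-up partition function followed by top-down sampling) can be sketched on a toy model. This is a minimal illustration under stated assumptions, not LinearSampling and not the Turner energy model: every allowed base pair simply contributes a fixed Boltzmann weight, and the names `partition_function`, `sample_structure`, `to_dot_bracket`, `pair_weight`, and `min_loop` are hypothetical.

```python
import random

# Toy assumption: Watson-Crick and GU wobble pairs are allowed, all with equal weight.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_weight=2.0, min_loop=3):
    """Bottom-up phase: Q[i][j] is the partition function of seq[i:j] (cubic time)."""
    n = len(seq)
    Q = [[1.0] * (n + 1) for _ in range(n + 1)]  # empty intervals have weight 1
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length          # exclusive end; the last base is j - 1
            last = j - 1
            total = Q[i][last]      # case 1: the last base is unpaired
            for k in range(i, last - min_loop):
                if (seq[k], seq[last]) in PAIRS:
                    # case 2: k pairs with last; left part and enclosed part are independent
                    total += Q[i][k] * pair_weight * Q[k + 1][last]
            Q[i][j] = total
    return Q


def sample_structure(seq, Q, rng, pair_weight=2.0, min_loop=3):
    """Top-down phase: draw one structure with probability weight / Q[0][n]."""
    pairs = []
    stack = [(0, len(seq))]
    while stack:
        i, j = stack.pop()
        if j - i < 2:
            continue                # fewer than two bases: nothing can pair
        last = j - 1
        r = rng.random() * Q[i][j] - Q[i][last]
        if r < 0:                   # chose "last base unpaired"
            stack.append((i, last))
            continue
        chosen = None
        for k in range(i, last - min_loop):
            if (seq[k], seq[last]) in PAIRS:
                chosen = k          # keep the latest valid k as a rounding fallback
                r -= Q[i][k] * pair_weight * Q[k + 1][last]
                if r < 0:
                    break
        if chosen is None:          # only reachable through floating-point rounding
            stack.append((i, last))
            continue
        pairs.append((chosen, last))
        stack.append((i, chosen))
        stack.append((chosen + 1, last))
    return sorted(pairs)


def to_dot_bracket(n, pairs):
    """Render a list of (i, j) pairs as a dot-bracket string of length n."""
    chars = ["."] * n
    for i, j in pairs:
        chars[i], chars[j] = "(", ")"
    return "".join(chars)
```

Note that the sampling loop recomputes the same products that the partition function phase already evaluated, which is the redundancy the abstract lists as limitation (b).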
38. Making ends meet: New functions of mRNA secondary structure.
- Author
-
Ermolenko DN and Mathews DH
- Subjects
- Protein Biosynthesis, RNA, Messenger genetics, RNA, Messenger metabolism, RNA, RNA Stability
- Abstract
The 5' cap and 3' poly(A) tail of mRNA are known to synergistically regulate mRNA translation and stability. Recent computational and experimental studies revealed that both protein-coding and non-coding RNAs will fold with extensive intramolecular secondary structure, which will result in close distances between the sequence ends. This proximity of the ends is a sequence-independent, universal property of most RNAs. Only low-complexity sequences without guanosines are without secondary structure and exhibit end-to-end distances expected for RNA random coils. The innate proximity of RNA ends might have important biological implications that remain unexplored. In particular, the inherent compactness of mRNA might regulate translation initiation by facilitating the formation of protein complexes that bridge mRNA 5' and 3' ends. Additionally, the proximity of mRNA ends might mediate coupling of 3' deadenylation to 5' end mRNA decay. This article is categorized under: RNA Structure and Dynamics > RNA Structure, Dynamics, and Chemistry; RNA Structure and Dynamics > Influence of RNA Structure in Biological Systems; Translation > Translation Regulation., (© 2020 Wiley Periodicals LLC.)
- Published
- 2021
- Full Text
- View/download PDF
39. Inverse RNA Folding Workflow to Design and Test Ribozymes that Include Pseudoknots.
- Author
-
Kayedkhordeh M, Yamagami R, Bevilacqua PC, and Mathews DH
- Subjects
- Base Sequence, G-Quadruplexes, In Vitro Techniques, Kinetics, Models, Molecular, Nucleic Acid Conformation, Nucleotide Motifs genetics, RNA Folding genetics, RNA, Catalytic chemistry, Software, Transcription, Genetic, Computational Biology methods, Hepatitis Delta Virus genetics, Hepatitis Delta Virus metabolism, RNA, Catalytic genetics, RNA, Catalytic metabolism
- Abstract
Ribozymes are RNAs that catalyze reactions. They occur in nature, and can also be evolved in vitro to catalyze novel reactions. This chapter provides detailed protocols for using inverse folding software to design a ribozyme sequence that will fold to a known ribozyme secondary structure and for testing the catalytic activity of the sequence experimentally. This protocol is able to design sequences that include pseudoknots, which is important as all naturally occurring full-length ribozymes have pseudoknots. The starting point is the known pseudoknot-containing secondary structure of the ribozyme and knowledge of any nucleotides whose identity is required for function. The output of the protocol is a set of sequences that have been tested for function. Using this protocol, we were previously successful at designing highly active double-pseudoknotted HDV ribozymes.
- Published
- 2021
- Full Text
- View/download PDF
40. Arginine Forks Are a Widespread Motif to Recognize Phosphate Backbones and Guanine Nucleobases in the RNA Major Groove.
- Author
-
Chavali SS, Cavender CE, Mathews DH, and Wedekind JE
- Subjects
- Arginine chemistry, Binding Sites, Guanine chemistry, HIV Long Terminal Repeat genetics, HIV-1 metabolism, Nucleic Acid Conformation, Phosphates chemistry, Phosphates metabolism, RNA, Viral chemistry, tat Gene Products, Human Immunodeficiency Virus genetics, tat Gene Products, Human Immunodeficiency Virus metabolism, Arginine metabolism, Guanine metabolism, RNA, Viral metabolism
- Abstract
RNA recognition by proteins is central to biology. Here we demonstrate the existence of a recurrent structural motif, the "arginine fork", that codifies arginine readout of cognate backbone and guanine nucleobase interactions in a variety of protein-RNA complexes derived from viruses, metabolic enzymes, and ribosomes. Nearly 30 years ago, a theoretical arginine fork model was posited to account for the specificity between the HIV-1 Tat protein and TAR RNA. This model predicted that a single arginine should form four complementary contacts with nearby phosphates, yielding a two-pronged backbone readout. Recent high-resolution structures of TAR-protein complexes have unveiled new details, including (i) arginine interactions with the phosphate backbone and the major-groove edge of guanine and (ii) simultaneous cation-π contacts between the guanidinium group and flanking nucleobases. These findings prompted us to search for arginine forks within experimental protein-RNA structures retrieved from the Protein Data Bank. The results revealed four distinct classes of arginine forks that we have defined using a rigorous but flexible nomenclature. Examples are presented in the context of ribosomal and nonribosomal interfaces with analysis of arginine dihedral angles and structural (suite) classification of RNA targets. When arginine fork chemical recognition principles were applied to existing structures with unusual arginine-guanine recognition, we found that the arginine fork geometry was more consistent with the experimental data, suggesting the utility of fork classifications to improve structural models. Software to analyze arginine-RNA interactions has been made available to the community.
- Published
- 2020
- Full Text
- View/download PDF
41. Analysis of a preQ1-I riboswitch in effector-free and bound states reveals a metabolite-programmed nucleobase-stacking spine that controls gene regulation.
- Author
-
Schroeder GM, Dutta D, Cavender CE, Jenkins JL, Pritchett EM, Baker CD, Ashton JM, Mathews DH, and Wedekind JE
- Subjects
- Base Pairing, Gene Expression Regulation, Bacterial, Guanine analogs & derivatives, Sodium Dodecyl Sulfate chemistry, Thermoanaerobacter genetics, Molecular Dynamics Simulation, Riboswitch
- Abstract
Riboswitches are structured RNA motifs that recognize metabolites to alter the conformations of downstream sequences, leading to gene regulation. To investigate this molecular framework, we determined crystal structures of a preQ1-I riboswitch in effector-free and bound states at 2.00 Å and 2.65 Å resolution. Both pseudoknots exhibited the elusive L2 loop, which displayed distinct conformations. Conversely, the Shine-Dalgarno sequence (SDS) in the S2 helix of each structure remained unbroken. The expectation that the effector-free state should expose the SDS prompted us to conduct solution experiments to delineate environmental changes to specific nucleobases in response to preQ1. We then used nudged elastic band computational methods to derive conformational-change pathways linking the crystallographically-determined effector-free and bound-state structures. Pathways featured: (i) unstacking and unpairing of L2 and S2 nucleobases without preQ1, exposing the SDS for translation, and (ii) stacking and pairing of L2 and S2 nucleobases with preQ1, sequestering the SDS. Our results reveal how preQ1 binding reorganizes L2 into a nucleobase-stacking spine that sequesters the SDS, linking effector recognition to biological function. The generality of stacking spines as conduits for effector-dependent, interdomain communication is discussed in light of their existence in adenine riboswitches, as well as the turnip yellow mosaic virus ribosome sensor., (© The Author(s) 2020. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2020
- Full Text
- View/download PDF
42. LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities.
- Author
-
Zhang H, Zhang L, Mathews DH, and Huang L
- Subjects
- Algorithms, Base Pairing, Humans, Nucleic Acid Conformation, Probability, Sequence Analysis, RNA, RNA genetics, RNA Folding
- Abstract
Motivation: RNA secondary structure prediction is widely used to understand RNA function. Recently, there has been a shift away from the classical minimum free energy methods to partition function-based methods that account for folding ensembles and can therefore estimate structure and base pair probabilities. However, the classical partition function algorithm scales cubically with sequence length, and is therefore prohibitively slow for long sequences. This slowness is even more severe than cubic-time free energy minimization due to a substantially larger constant factor in runtime., Results: Inspired by the success of our recent LinearFold algorithm that predicts the approximate minimum free energy structure in linear time, we design a similar linear-time heuristic algorithm, LinearPartition, to approximate the partition function and base-pairing probabilities, which is shown to be orders of magnitude faster than Vienna RNAfold and CONTRAfold (e.g. 2.5 days versus 1.3 min on a sequence with length 32 753 nt). More interestingly, the resulting base-pairing probabilities are even better correlated with the ground-truth structures. LinearPartition also leads to a small accuracy improvement when used for downstream structure prediction on families with the longest length sequences (16S and 23S rRNAs), as well as a substantial improvement on long-distance base pairs (500+ nt apart)., Availability and Implementation: Code: http://github.com/LinearFold/LinearPartition; Server: http://linearfold.org/partition., Supplementary Information: Supplementary data are available at Bioinformatics online., (© The Author(s) 2020. Published by Oxford University Press.)
- Published
- 2020
- Full Text
- View/download PDF
43. Determining parameters for non-linear models of multi-loop free energy change.
- Author
-
Ward M, Sun H, Datta A, Wise M, and Mathews DH
- Subjects
- Algorithms, Nucleic Acid Conformation, RNA, Software, Nonlinear Dynamics
- Abstract
Motivation: Predicting the secondary structure of RNA is a fundamental task in bioinformatics. Algorithms that predict secondary structure given only the primary sequence, and a model to evaluate the quality of a structure, are an integral part of this. These algorithms have been updated as our model of RNA thermodynamics changed and expanded. An exception to this has been the treatment of multi-loops. Although more advanced models of multi-loop free energy change have been suggested, a simple, linear model has been used since the 1980s. However, recently, new dynamic programming algorithms for secondary structure prediction that could incorporate these models were presented. Unfortunately, these models appear to have lower accuracy for secondary structure prediction., Results: We apply linear regression and a new parameter optimization algorithm to find better parameters for the existing linear model and advanced non-linear multi-loop models. These include the Jacobson-Stockmayer and Aalberts & Nandagopal models. We find that the current linear model parameters may be near optimal for the linear model, and that no advanced model performs better than the existing linear model parameters even after parameter optimization., Availability and Implementation: Source code and data are available at https://github.com/maxhwardg/advanced_multiloops., Supplementary Information: Supplementary data are available at Bioinformatics online., (© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.)
- Published
- 2019
- Full Text
- View/download PDF
44. CRISPR-Cas9-based mutagenesis frequently provokes on-target mRNA misregulation.
- Author
-
Tuladhar R, Yeu Y, Tyler Piazza J, Tan Z, Rene Clemenceau J, Wu X, Barrett Q, Herbert J, Mathews DH, Kim J, Hyun Hwang T, and Lum L
- Subjects
- Amino Acid Sequence, Base Sequence, CRISPR-Cas Systems, Cell Line, Cell Line, Tumor, Codon, Nonsense genetics, Frameshift Mutation, Gene Expression Regulation, Neoplastic, Gene Knockout Techniques, HeLa Cells, Humans, INDEL Mutation, RNA Stability, RNA, Messenger chemistry, Gene Editing methods, Mutagenesis, RNA, Messenger genetics
- Abstract
The introduction of insertion-deletions (INDELs) by non-homologous end-joining (NHEJ) pathway underlies the mechanistic basis of CRISPR-Cas9-directed genome editing. Selective gene ablation using CRISPR-Cas9 is achieved by installation of a premature termination codon (PTC) from a frameshift-inducing INDEL that elicits nonsense-mediated decay (NMD) of the mutant mRNA. Here, by examining the mRNA and protein products of CRISPR targeted genes in a cell line panel with presumed gene knockouts, we detect the production of foreign mRNAs or proteins in ~50% of the cell lines. We demonstrate that these aberrant protein products stem from the introduction of INDELs that promote internal ribosomal entry, convert pseudo-mRNAs (alternatively spliced mRNAs with a PTC) into protein encoding molecules, or induce exon skipping by disruption of exon splicing enhancers (ESEs). Our results reveal challenges to manipulating gene expression outcomes using INDEL-based mutagenesis and strategies useful in mitigating their impact on intended genome-editing outcomes.
- Published
- 2019
- Full Text
- View/download PDF
45. LinearFold: linear-time approximate RNA folding by 5'-to-3' dynamic programming and beam search.
- Author
-
Huang L, Zhang H, Deng D, Zhao K, Liu K, Hendrix DA, and Mathews DH
- Subjects
- Nucleic Acid Conformation, RNA, Sequence Analysis, RNA, Software, RNA Folding
- Abstract
Motivation: Predicting the secondary structure of a ribonucleic acid (RNA) sequence is useful in many applications. Existing algorithms [based on dynamic programming] suffer from a major limitation: their runtimes scale cubically with the RNA length, and this slowness limits their use in genome-wide applications., Results: We present a novel alternative O(n^3)-time dynamic programming algorithm for RNA folding that is amenable to heuristics that make it run in O(n) time and O(n) space, while producing a high-quality approximation to the optimal solution. Inspired by incremental parsing for context-free grammars in computational linguistics, our alternative dynamic programming algorithm scans the sequence in a left-to-right (5'-to-3') direction rather than in a bottom-up fashion, which allows us to employ the effective beam pruning heuristic. Our work, though inexact, is the first RNA folding algorithm to achieve linear runtime (and linear space) without imposing constraints on the output structure. Surprisingly, our approximate search results in even higher overall accuracy on a diverse database of sequences with known structures. More interestingly, it leads to significantly more accurate predictions on the longest sequence families in that database (16S and 23S Ribosomal RNAs), as well as improved accuracies for long-range base pairs (500+ nucleotides apart), both of which are well known to be challenging for the current models., Availability and Implementation: Our source code is available at https://github.com/LinearFold/LinearFold, and our webserver is at http://linearfold.org (sequence limit: 100 000 nt)., Supplementary Information: Supplementary data are available at Bioinformatics online., (© The Author(s) 2019. Published by Oxford University Press.)
- Published
- 2019
- Full Text
- View/download PDF
46. How to benchmark RNA secondary structure prediction accuracy.
- Author
-
Mathews DH
- Subjects
- Algorithms, Computational Biology standards, RNA genetics, Sequence Analysis, RNA standards, Software, Computational Biology methods, Nucleic Acid Conformation, RNA chemistry, Sequence Analysis, RNA methods
- Abstract
RNA secondary structure prediction is widely used. As new methods are developed, these are often benchmarked for accuracy against existing methods. This review discusses good practices for performing these benchmarks, including the choice of benchmarking structures, metrics to quantify accuracy, the importance of allowing flexibility for pairs in the accepted structure, and the importance of statistical testing for significance., (Copyright © 2019. Published by Elsevier Inc.)
- Published
- 2019
- Full Text
- View/download PDF
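The abstract of the preceding record mentions accuracy metrics and flexibility for pairs in the accepted structure without giving details. The sketch below shows one common way such a benchmark can be scored, as an assumption rather than the review's stated procedure: sensitivity and positive predictive value (PPV) over base pairs, where a pair counts as matched if one of its two positions is shifted by at most `slip` nucleotides. The function name `score_pairs` and the `slip` parameter are hypothetical.

```python
def score_pairs(predicted, accepted, slip=1):
    """Return (sensitivity, ppv) for two collections of (i, j) base pairs.

    Assumption: a pair (i, j) matches a reference pair if it is identical, or if
    exactly one side is shifted by up to `slip` positions. slip=0 is exact matching.
    """
    predicted, accepted = set(predicted), set(accepted)

    def matches(pair, reference):
        i, j = pair
        if (i, j) in reference:
            return True
        for d in range(1, slip + 1):
            if {(i - d, j), (i + d, j), (i, j - d), (i, j + d)} & reference:
                return True
        return False

    # Sensitivity: fraction of accepted pairs that the prediction recovers.
    found = sum(matches(p, predicted) for p in accepted)
    # PPV: fraction of predicted pairs that appear in the accepted structure.
    correct = sum(matches(p, accepted) for p in predicted)

    sensitivity = found / len(accepted) if accepted else 0.0
    ppv = correct / len(predicted) if predicted else 0.0
    return sensitivity, ppv
```

Calling the function once with `slip=0` and once with `slip=1` on the same pair sets shows how much of a reported accuracy depends on the flexibility rule, which is worth stating explicitly in any benchmark.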
47. Estimating uncertainty in predicted folding free energy changes of RNA secondary structures.
- Author
-
Zuber J and Mathews DH
- Subjects
- Base Pairing, Base Sequence, Humans, Thermodynamics, Uncertainty, Algorithms, RNA chemistry, RNA Folding, Software
- Abstract
Nearest neighbor parameters for estimating the folding stability of RNA are commonly used in secondary structure prediction, for generating folding ensembles of structures, and for analyzing RNA function. Previously, we demonstrated that we could quantify the uncertainties in each nearest neighbor parameter by perturbing the underlying optical melting data within experimental error and rederiving the parameters, which accounts for the substantial correlations that exist between the parameters. In this contribution, we describe a method to estimate uncertainty in the estimated folding stabilities of RNA structures, accounting for correlations in the nearest neighbor parameters. This method is incorporated in the RNA structure software package., (© 2019 Zuber and Mathews; Published by Cold Spring Harbor Laboratory Press for the RNA Society.)
- Published
- 2019
- Full Text
- View/download PDF
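The abstract of the preceding record describes propagating parameter uncertainty to a predicted folding stability while accounting for correlations between nearest neighbor parameters. One way to picture this, as a sketch under assumptions and not the RNAstructure implementation: if a set of rederived parameter tables is available (one per perturbation of the melting data), the stability of a structure can be recomputed with each table and the spread of the results taken as the uncertainty. Because each table is used as a whole, correlations between parameters are carried through automatically. The names `stability_uncertainty`, `counts`, and `parameter_samples` are hypothetical, and the model is assumed to be a simple sum of parameter values times usage counts.

```python
import statistics


def stability_uncertainty(counts, parameter_samples):
    """Estimate mean and standard deviation of a structure's folding free energy.

    counts: dict mapping parameter name -> number of times the structure uses it.
    parameter_samples: list of dicts, each a complete rederived parameter table
        (assumed to come from perturbing the underlying data within its error).
    """
    energies = [
        sum(n * sample[name] for name, n in counts.items())
        for sample in parameter_samples
    ]
    return statistics.mean(energies), statistics.stdev(energies)
```

If two parameters are anti-correlated across the tables, their errors partly cancel in the total, so the resulting uncertainty can be smaller than what adding the individual parameter variances independently would suggest.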
48. Conservation of location of several specific inhibitory codon pairs in the Saccharomyces sensu stricto yeasts reveals translational selection.
- Author
-
Ghoneim DH, Zhang X, Brule CE, Mathews DH, and Grayhack EJ
- Subjects
- Base Sequence, Candida genetics, Conserved Sequence, Genes, Fungal, Saccharomyces cerevisiae genetics, Codon, Protein Biosynthesis, Saccharomyces genetics
- Abstract
Synonymous codons provide redundancy in the genetic code that influences translation rates in many organisms, in which overall codon use is driven by selection for optimal codons. It is unresolved if or to what extent translational selection drives use of suboptimal codons or codon pairs. In Saccharomyces cerevisiae, 17 specific inhibitory codon pairs, each comprised of adjacent suboptimal codons, inhibit translation efficiency in a manner distinct from their constituent codons, and many are translated slowly in native genes. We show here that selection operates within Saccharomyces sensu stricto yeasts to conserve nine of these codon pairs at defined positions in genes. Conservation of these inhibitory codon pairs is significantly greater than expected, relative to conservation of their constituent codons, with seven pairs more highly conserved than any other synonymous pair. Conservation is strongly correlated with slow translation of the pairs. Conservation of suboptimal codon pairs extends to two related Candida species, fungi that diverged from Saccharomyces ∼270 million years ago, with an enrichment for codons decoded by I•A and U•G wobble in both Candida and Saccharomyces. Thus, conservation of inhibitory codon pairs strongly implies selection for slow translation at particular gene locations, executed by suboptimal codon pairs., (© The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.)
- Published
- 2019
- Full Text
- View/download PDF
49. Design of highly active double-pseudoknotted ribozymes: a combined computational and experimental study.
- Author
-
Yamagami R, Kayedkhordeh M, Mathews DH, and Bevilacqua PC
- Subjects
- Algorithms, Biotechnology trends, Kinetics, Nucleic Acid Conformation, RNA, Catalytic genetics, Synthetic Biology trends, Computational Biology methods, RNA Folding, RNA, Catalytic chemistry, Riboswitch genetics
- Abstract
Design of RNA sequences that adopt functional folds establishes principles of RNA folding and applications in biotechnology. Inverse folding for RNAs, which allows computational design of sequences that adopt specific structures, can be utilized for unveiling RNA functions and developing genetic tools in synthetic biology. Although many algorithms for inverse RNA folding have been developed, the pseudoknot, which plays a key role in folding of ribozymes and riboswitches, is not addressed in most algorithms. For the few algorithms that attempt to predict pseudoknot-containing ribozymes, self-cleavage activity has not been tested. Herein, we design double-pseudoknot HDV ribozymes using an inverse RNA folding algorithm and test their kinetic mechanisms experimentally. More than 90% of the positively designed ribozymes possess self-cleaving activity, whereas more than 70% of negative control ribozymes, which are predicted to fold to the necessary structure but with low fidelity, do not possess it. Kinetic and mutation analyses reveal that these RNAs cleave site-specifically and with the same mechanism as the WT ribozyme. Most ribozymes react just 50- to 80-fold slower than the WT ribozyme, and this rate can be improved to near WT by modification of a junction. Thus, fast-cleaving functional ribozymes with multiple pseudoknots can be designed computationally.
- Published
- 2019
- Full Text
- View/download PDF
50. Identification of new high affinity targets for Roquin based on structural conservation.
- Author
-
Braun J, Fischer S, Xu ZZ, Sun H, Ghoneim DH, Gimbel AT, Plessmann U, Urlaub H, Mathews DH, and Weigand JE
- Subjects
- 3' Untranslated Regions, Animals, Binding Sites, Cell Line, Computer Simulation, DNA Mutational Analysis, HEK293 Cells, HeLa Cells, Humans, Mice, Nucleic Acid Conformation, Nucleotides genetics, Protein Binding, RNA, Messenger metabolism, RNA-Binding Proteins genetics, Transcription, Genetic, Ubiquitin-Protein Ligases genetics, Computational Biology methods, Gene Expression Regulation, RNA Folding, RNA-Binding Proteins chemistry, Ubiquitin-Protein Ligases chemistry
- Abstract
Post-transcriptional gene regulation controls the amount of protein produced from a specific mRNA by altering both its decay and translation rates. Such regulation is primarily achieved by the interaction of trans-acting factors with cis-regulatory elements in the untranslated regions (UTRs) of mRNAs. These interactions are guided either by sequence- or structure-based recognition. Similar to sequence conservation, the evolutionary conservation of a UTR's structure thus reflects its functional importance. We used such structural conservation to identify previously unknown cis-regulatory elements. Using the RNA folding program Dynalign, we scanned all UTRs of humans and mice for conserved structures. Characterizing a subset of putative conserved structures revealed a binding site of the RNA-binding protein Roquin. Detailed functional characterization in vivo enabled us to redefine the binding preferences of Roquin and identify new target genes. Many of these new targets are unrelated to the established role of Roquin in inflammation and immune responses and thus highlight additional, unstudied cellular functions of this important repressor. Moreover, the expression of several Roquin targets is highly cell-type-specific. In consequence, these targets are difficult to detect using methods dependent on mRNA abundance, yet easily detectable with our unbiased strategy.
- Published
- 2018
- Full Text
- View/download PDF