Improved prediction of RNA secondary structure by integrating the free energy model with restraints derived from experimental probing data
- Source :
- Nucleic Acids Research
- Publication Year :
- 2015
- Publisher :
- Oxford University Press, 2015.
- Abstract :
- Recently, several experimental techniques have emerged for probing RNA structures based on high-throughput sequencing. However, most secondary structure prediction tools that incorporate probing data are designed and optimized for particular types of experiments. For example, RNAstructure-Fold is optimized for SHAPE data, while SeqFold is optimized for PARS data. Here, we report a new RNA secondary structure prediction method, restrained MaxExpect (RME), which can incorporate multiple types of experimental probing data and is based on a free energy model and an MEA (maximizing expected accuracy) algorithm. We first demonstrated that RME substantially improved secondary structure prediction with perfect restraints (base pair information of known structures). Next, we collected structure-probing data from diverse experiments (e.g. SHAPE, PARS and DMS-seq) and transformed them into a unified set of pairing probabilities with a posterior probabilistic model. By using the probability scores as restraints in RME, we compared its secondary structure prediction performance with two other well-known tools, RNAstructure-Fold (based on a free energy minimization algorithm) and SeqFold (based on a sampling algorithm). For SHAPE data, RME and RNAstructure-Fold performed better than SeqFold, because they markedly altered the energy model with the experimental restraints. For high-throughput data (e.g. PARS and DMS-seq) with lower probing efficiency, the secondary structure prediction performances of the tested tools were comparable, with performance improvements for only a portion of the tested RNAs. However, when the effects of tertiary structure and protein interactions were removed, RME showed the highest prediction accuracy in the DMS-accessible regions by incorporating in vivo DMS-seq data.
- Subjects :
- Molecular Probe Techniques
Biology
Bioinformatics
Energy minimization
Nucleic acid secondary structure
Set (abstract data type)
03 medical and health sciences
0302 clinical medicine
Genetics
Protein secondary structure
030304 developmental biology
0303 health sciences
Models, Statistical
Sampling (statistics)
Computational Biology
Statistical model
Protein tertiary structure
Models, Chemical
Nucleic Acid Conformation
RNA
Thermodynamics
Algorithm
030217 neurology & neurosurgery
Energy (signal processing)
Algorithms
Software
Details
- Language :
- English
- ISSN :
- 1362-4962 and 0305-1048
- Volume :
- 43
- Issue :
- 15
- Database :
- OpenAIRE
- Journal :
- Nucleic Acids Research
- Accession number :
- edsair.doi.dedup.....5d5e44ae4b553613c914e5cb9a48f7ad