Accurate Classification of RNA Structures Using Topological Fingerprints.
- Source :
- PloS one [PLoS One] 2016 Oct 18; Vol. 11 (10), pp. e0164726. Date of Electronic Publication: 2016 Oct 18 (Print Publication: 2016).
- Publication Year :
- 2016
Abstract
- While RNAs are well known to possess complex structures, functionally similar RNAs often have little sequence similarity. While the exact size and spacing of base-paired regions vary, functionally similar RNAs have pronounced similarity in the arrangement, or topology, of base-paired stems. Furthermore, predicted RNA structures often lack pseudoknots (a crucial aspect of biological activity), and are only partially correct, or incomplete. A topological approach addresses all of these difficulties. In this work we describe each RNA structure as a graph that can be converted to a topological spectrum (RNA fingerprint). The set of subgraphs in an RNA structure, its RNA fingerprint, can be compared with the fingerprints of other RNA structures to identify and correctly classify functionally related RNAs. Topologically similar RNAs can be identified even when a large fraction, up to 30%, of the stems are omitted, indicating that highly accurate structures are not necessary. We investigate the performance of the RNA fingerprint approach on a set of eight highly curated RNA families, with diverse sizes and functions, containing pseudoknots, and with little sequence similarity, an especially difficult test set. In spite of the difficult test set, the RNA fingerprint approach is very successful (ROC AUC > 0.95). Due to the inclusion of pseudoknots, the RNA fingerprint approach both covers a wider range of possible structures than methods based only on secondary structure, and its tolerance for incomplete structures suggests that it can be applied even to predicted structures. Source code is freely available at https://github.rcac.purdue.edu/mgribsko/XIOS_RNA_fingerprint.
- Competing Interests: The first author, Jiajie Huang, is currently employed by Thermo Fisher Scientific. The second author, Kejie Li, is currently employed by Biogen Idec. These commercial affiliations do not alter the authors' adherence to all PLOS ONE policies on sharing data and materials.
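The fingerprint idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each stem is reduced to a single representative base pair `(i, j)`, that the relation between two stems is one of three undirected labels (serial, nested, pseudoknot), that the fingerprint is the set of label patterns over all 2-stem and 3-stem subsets, and that fingerprints are compared by Jaccard similarity. The names `relation`, `fingerprint`, and `similarity` are hypothetical.

```python
from itertools import combinations


def relation(s, t):
    """Relation between two stems, each given as one base pair (i, j) with i < j.

    Assumption: all positions are distinct, so the two pairs never share an end.
    """
    (i, j), (k, l) = sorted((s, t))  # order so that the first stem opens first
    if j < k:
        return "serial"      # first stem closes before the second opens
    if l < j:
        return "nested"      # second stem lies entirely inside the first
    return "pseudoknot"      # the pairs cross: i < k < j < l


def fingerprint(stems):
    """Set of relation patterns over every 2-stem and 3-stem subset.

    Each subset is summarized by the sorted tuple of its pairwise relations,
    a simplified stand-in for enumerating subgraphs of the structure graph.
    """
    patterns = set()
    for size in (2, 3):
        for subset in combinations(stems, size):
            labels = sorted(relation(a, b) for a, b in combinations(subset, 2))
            patterns.add(tuple(labels))
    return patterns


def similarity(fp_a, fp_b):
    """Jaccard similarity between two fingerprints (an assumed comparison measure)."""
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 1.0


# Toy structure with a pseudoknot, and the same structure with one stem omitted.
full = [(1, 40), (5, 15), (20, 35), (30, 50)]
partial = [(1, 40), (20, 35), (30, 50)]

fp_full = fingerprint(full)
fp_partial = fingerprint(partial)
print(similarity(fp_full, fp_partial))
```

In this toy case every pattern in the partial structure also occurs in the full one, which mirrors the abstract's point that a structure with missing stems can still share topological features with the complete structure.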
Details
- Language :
- English
- ISSN :
- 1932-6203
- Volume :
- 11
- Issue :
- 10
- Database :
- MEDLINE
- Journal :
- PloS one
- Publication Type :
- Academic Journal
- Accession number :
- 27755571
- Full Text :
- https://doi.org/10.1371/journal.pone.0164726