1. LncFinder: an integrated platform for long non-coding RNA identification utilizing sequence intrinsic composition, structural information and physicochemical property
- Author
- Siyu Han, Wei Du, Yangyi Xu, Yanchun Liang, Ying Li, Cankun Wang, Qin Ma, and Yu Zhang
- Subjects
Paper, Web server, Computer science, 0206 medical engineering, 02 engineering and technology, computer.software_genre, Machine learning, EIIP physicochemical property, Machine Learning, 03 medical and health sciences, Animals, Humans, Molecular Biology, 030304 developmental biology, 0303 health sciences, business.industry, sequence intrinsic composition, Computational Biology, Multiple species, Long non-coding RNA, R package, multi-scale secondary structure, Nucleic Acid Conformation, RNA, Long Noncoding, Artificial intelligence, business, predictive modeling, computer, Classifier (UML), Algorithms, 020602 bioinformatics, Information Systems
- Abstract
Discovering new long non-coding RNAs (lncRNAs) is a fundamental step in lncRNA-related research, and many machine learning-based tools have been developed for lncRNA identification. However, many methods predict lncRNAs using sequence-derived features alone, which tend to perform unstably across species. Moreover, most tools cannot be re-trained or tailored by users, nor can their features be customized or integrated to meet researchers’ requirements. In this study, features extracted from sequence-intrinsic composition, secondary structure and physicochemical property are comprehensively reviewed and evaluated. An integrated platform named LncFinder is also developed to enhance the performance and promote the research of lncRNA identification. LncFinder includes a novel lncRNA predictor using the heterologous features we designed. Experimental results show that our method outperforms several state-of-the-art tools on multiple species with more robust and satisfactory results. Researchers can additionally employ LncFinder to extract various classic features, build classifiers with numerous machine learning algorithms and evaluate classifier performance effectively and efficiently. LncFinder can reveal the properties of lncRNA and mRNA from various perspectives and further inspire lncRNA–protein interaction prediction and lncRNA evolution analysis. It is anticipated that LncFinder can significantly facilitate lncRNA-related research, especially for poorly explored species. LncFinder is released as an R package (https://CRAN.R-project.org/package=LncFinder). A web server (http://bmbl.sdstate.edu/lncfinder/) has also been developed to maximize its availability.
- Published
- 2018