SFPEL-LPI: Sequence-based feature projection ensemble learning for predicting LncRNA-protein interactions
- Source :
- PLoS Computational Biology, Vol 14, Iss 12, p e1006616 (2018)
- Publication Year :
- 2018
- Publisher :
- Public Library of Science (PLoS), 2018.
Abstract
- LncRNA-protein interactions play important roles in post-transcriptional gene regulation, poly-adenylation, splicing and translation. Identifying lncRNA-protein interactions helps in understanding lncRNA-related activities. Existing computational methods use multiple lncRNA features or multiple protein features to predict lncRNA-protein interactions, but these features are not available for all lncRNAs or proteins. Moreover, most existing methods cannot predict interacting proteins (or lncRNAs) for new lncRNAs (or proteins) that have no known interactions. In this paper, we propose a sequence-based feature projection ensemble learning method, “SFPEL-LPI”, to predict lncRNA-protein interactions. First, SFPEL-LPI extracts lncRNA sequence-based features and protein sequence-based features. Second, SFPEL-LPI calculates multiple lncRNA-lncRNA similarities and protein-protein similarities from lncRNA sequences, protein sequences and known lncRNA-protein interactions. Then, SFPEL-LPI combines the multiple similarities and multiple features in a feature projection ensemble learning framework. In computational experiments, SFPEL-LPI accurately predicts lncRNA-protein associations and outperforms other state-of-the-art methods. More importantly, SFPEL-LPI can be applied to new lncRNAs (or proteins). Case studies demonstrate that our method can identify novel lncRNA-protein interactions, which are confirmed by the literature. Finally, we provide a user-friendly web server, available at http://www.bioinfotech.cn/SFPEL-LPI/.
Author summary: LncRNA-protein interactions play important roles in post-transcriptional gene regulation, poly-adenylation, splicing and translation. Identifying lncRNA-protein interactions helps in understanding lncRNA-related activities. In this paper, we propose a novel computational method, “SFPEL-LPI”, to predict lncRNA-protein interactions. SFPEL-LPI uses lncRNA sequences, protein sequences and known lncRNA-protein associations to extract features and calculate similarities for lncRNAs and proteins, and then combines them in a feature projection ensemble learning framework. SFPEL-LPI can predict unobserved interactions between lncRNAs and proteins, and can also make predictions for new lncRNAs (or proteins) that have no known interactions with any proteins (or lncRNAs). SFPEL-LPI achieves high accuracy on the benchmark dataset under five-fold cross-validation and outperforms state-of-the-art methods. Case studies demonstrate that SFPEL-LPI can identify novel associations, which are confirmed by the literature. To facilitate lncRNA-protein interaction prediction, we developed a user-friendly web server, available at http://www.bioinfotech.cn/SFPEL-LPI/.
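The first two steps described in the abstract (sequence-based features, then similarities computed from sequences and from known interactions) can be illustrated with a minimal Python sketch. The abstract does not say which features or similarity measures SFPEL-LPI uses, so everything here is an assumption chosen for illustration: the feature is a plain k-mer frequency vector, sequence similarity is the cosine of those vectors, and interaction-based similarity is the Jaccard index of binary interaction profiles. The function names and the toy sequences are hypothetical and are not taken from the paper or its web server.

```python
from itertools import product
import math


def kmer_features(seq, alphabet="ACGU", k=2):
    """Normalized k-mer frequency vector (an assumed, illustrative feature)."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = {km: 0 for km in kmers}
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with characters outside the alphabet
            counts[window] += 1
    total = sum(counts.values())
    return [counts[km] / total if total else 0.0 for km in kmers]


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def profile_similarity(p, q):
    """Jaccard similarity between two binary interaction profiles."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0


# Toy data (hypothetical): three lncRNA sequences and their known
# interactions with three proteins, one 0/1 entry per protein.
lncrnas = {"lnc1": "AUGGCUAUGG", "lnc2": "AUGGCUAUGC", "lnc3": "CCCCUUUUCC"}
interactions = {"lnc1": [1, 0, 1], "lnc2": [1, 0, 0], "lnc3": [0, 1, 0]}

features = {name: kmer_features(seq) for name, seq in lncrnas.items()}
seq_sim_12 = cosine(features["lnc1"], features["lnc2"])
seq_sim_13 = cosine(features["lnc1"], features["lnc3"])
prof_sim_12 = profile_similarity(interactions["lnc1"], interactions["lnc2"])
```

In this sketch, a new lncRNA with an all-zero interaction profile gets a profile similarity of 0 to every other lncRNA, while its sequence-based similarity is still defined. That is one way to see why sequence-derived information matters for lncRNAs (or proteins) without known interactions.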
- Subjects :
- Proteomics
0301 basic medicine
Protein Extraction
Computer science
Biochemistry
Machine Learning
Database and Informatics Methods
Mathematical and Statistical Techniques
0302 clinical medicine
Protein sequencing
Feature (machine learning)
RNA Processing, Post-Transcriptional
Biology (General)
Databases, Protein
Projection (set theory)
Peptide sequence
Extraction Techniques
Sequence
Ecology
Gene Ontologies
Statistics
RNA-Binding Proteins
Genomics
Genomic Databases
Nucleic acids
Computational Theory and Mathematics
030220 oncology & carcinogenesis
Modeling and Simulation
Physical Sciences
Protein Interaction Networks
RNA, Long Noncoding
Identification (biology)
Databases, Nucleic Acid
Network Analysis
Algorithms
Research Article
Protein Binding
Computer and Information Sciences
QH301-705.5
Computational biology
Research and Analysis Methods
Protein–protein interaction
03 medical and health sciences
Cellular and Molecular Neuroscience
Genetics
Humans
Amino Acid Sequence
Statistical Methods
Non-coding RNA
Protein Interactions
Molecular Biology
Ecology, Evolution, Behavior and Systematics
Biology and life sciences
Base Sequence
Proteins
Computational Biology
Genome Analysis
Ensemble learning
Biological Databases
030104 developmental biology
Protein-Protein Interactions
Long non-coding RNAs
RNA
Mathematics
Forecasting
Details
- Language :
- English
- ISSN :
- 1553-7358
- Volume :
- 14
- Issue :
- 12
- Database :
- OpenAIRE
- Journal :
- PLoS Computational Biology
- Accession number :
- edsair.doi.dedup.....901273b4ad3e85651009c8acd757f9d0