102 results on '"LncRNA–protein interaction"'
Search Results
2. The Landscape of Long Non-Coding RNA Dysregulation and Clinical Relevance in Muscle Invasive Bladder Urothelial Carcinoma.
- Author
-
Shen, Haotian, Wong, Lindsay, Li, Wei, Chu, Megan, High, Rachel, Chang, Eric, Ongkeko, Weg, and Wang-Rodriguez, Jessica
- Subjects
TCGA, bladder carcinoma, lncRNA-protein interaction, lncRNAs
- Abstract
Bladder cancer is one of the most common cancers in the United States, but few advancements in treatment options have occurred in the past few decades. This study aims to identify the most clinically relevant long non-coding RNAs (lncRNAs) to serve as potential biomarkers and treatment targets for muscle invasive bladder cancer (MIBC). Using RNA-sequencing data from 406 patients in The Cancer Genome Atlas (TCGA) database, we identified differentially expressed lncRNAs in MIBC vs. normal tissues. We then associated lncRNA expression with patient survival, clinical variables, oncogenic signatures, cancer- and immune-associated pathways, and genomic alterations. We identified a panel of 20 key lncRNAs that were most implicated in MIBC prognosis after differential expression analysis and prognostic correlations. Almost all lncRNAs we identified are correlated significantly with oncogenic processes. In conclusion, we discovered previously undescribed lncRNAs strongly implicated in the MIBC disease course that may be leveraged for diagnostic and treatment purposes in the future. Functional analysis of these lncRNAs may also reveal distinct mechanisms of bladder cancer carcinogenesis.
- Published
- 2019
3. LPIH2V: LncRNA-protein interactions prediction using HIN2Vec based on heterogeneous networks model.
- Author
-
Meng-Meng Wei, Chang-Qing Yu, Li-Ping Li, Zhu-Hong You, Zhong-Hao Ren, Yong-Jian Guan, Xin-Fei Wang, and Yue-Chao Li
- Subjects
FEATURE extraction, PROTEIN-protein interactions, LINCRNA
- Abstract
LncRNA-protein interactions play an important role in the development and treatment of many human diseases. Because experimental approaches for determining lncRNA–protein interactions are expensive and time-consuming, and few computational methods are available, efficient and accurate methods for predicting lncRNA-protein interactions are urgently needed. In this work, a meta-path-based heterogeneous network embedding model, LPIH2V, is proposed. The heterogeneous network is composed of lncRNA similarity networks, protein similarity networks, and known lncRNA-protein interaction networks. Behavioral features are extracted from the heterogeneous network using the HIN2Vec network embedding method. LPIH2V obtains an AUC of 0.97 and an ACC of 0.95 in the 5-fold cross-validation test. The model showed superiority and good generalization ability. Compared to other models, LPIH2V not only extracts attribute characteristics by similarity, but also acquires behavior properties by meta-path walks in heterogeneous networks. LPIH2V would be beneficial in forecasting interactions between lncRNAs and proteins. [ABSTRACT FROM AUTHOR]
- Published
- 2023
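The meta-path walks mentioned in the LPIH2V abstract can be illustrated with a minimal sketch. This is not the HIN2Vec algorithm or the authors' code; it only shows a random walk that is constrained to follow a node-type pattern. The toy network, the node names, and the function `metapath_walk` are hypothetical.

```python
import random

def metapath_walk(graph, node_type, start, metapath, walk_length, rng):
    """Random walk whose i-th node must have type metapath[i % len(metapath)]."""
    walk = [start]
    for step in range(1, walk_length):
        wanted = metapath[step % len(metapath)]
        # Only neighbours of the required type are eligible for the next step.
        candidates = [n for n in graph[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbour of the required type
        walk.append(rng.choice(candidates))
    return walk

# Hypothetical toy heterogeneous network: similarity edges (L-L, P-P)
# and known interaction edges (L-P), stored as adjacency lists.
graph = {
    "L1": ["P1", "P2", "L2"],
    "L2": ["P2", "L1"],
    "P1": ["L1", "P2"],
    "P2": ["L1", "L2", "P1"],
}
node_type = {n: ("lncRNA" if n.startswith("L") else "protein") for n in graph}

walk = metapath_walk(graph, node_type, "L1", ["lncRNA", "protein"], 6, random.Random(0))
print(walk)
```

In an embedding pipeline, many such walks would typically serve as training sequences for a representation learner; that step is omitted here.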
4. LPInsider: a webserver for lncRNA–protein interaction extraction from the literature
- Author
-
Ying Li, Lizheng Wei, Cankun Wang, Jianing Zhao, Siyu Han, Yu Zhang, and Wei Du
- Subjects
lncRNA–protein interaction, Corpus, Named entity recognition, Multiple text features, Logistic regression, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
- Abstract
Background: Long non-coding RNAs (lncRNAs) play important roles in physiological and pathological processes. Identifying lncRNA–protein interactions (LPIs) is essential to understand the molecular mechanisms and infer the functions of lncRNAs. Given the overwhelming size of the biomedical literature, extracting LPIs directly from it is essential, promising, and challenging. However, there is no webserver for extracting LPI relationships from the literature. Results: LPInsider is developed as the first webserver for extracting LPIs from biomedical literature texts based on multiple text features (semantic word vectors, syntactic structure vectors, distance vectors, and part-of-speech vectors) and logistic regression. LPInsider allows researchers to extract LPIs by uploading a PMID, PMCID, PMID list, or biomedical text. A manually filtered and highly reliable LPI corpus is integrated in LPInsider. Comprehensive experiments on different combinations of features and machine learning models indicate that the performance of LPInsider is optimal. Conclusions: LPInsider is an efficient analytical tool for LPIs that helps researchers enhance their comprehension of lncRNAs through text mining and saves them time. In addition, LPInsider is freely accessible from http://www.csbg-jlu.info/LPInsider/ with no login requirement. The source code and LPI corpus can be downloaded from https://github.com/qiufengdiewu/LPInsider .
- Published
- 2022
5. RLF-LPI: An ensemble learning framework using sequence information for predicting lncRNA-protein interaction based on AE-ResLSTM and fuzzy decision
- Author
-
Jinmiao Song, Shengwei Tian, Long Yu, Qimeng Yang, Qiguo Dai, Yuanxu Wang, Weidong Wu, and Xiaodong Duan
- Subjects
deep learning, lncrna-protein interaction, fuzzy decision, extra trees, attention mechanism, Biotechnology, TP248.13-248.65, Mathematics, QA1-939
- Abstract
Long non-coding RNAs (lncRNAs) play a regulatory role in many biological cells, and recognizing lncRNA-protein interactions helps reveal the functional mechanisms of lncRNAs. Identifying lncRNA-protein interactions by biological techniques is costly and time-consuming. Here, an ensemble learning framework, RLF-LPI, is proposed to predict lncRNA-protein interactions. Its residual LSTM autoencoder module with a fusion attention mechanism extracts latent feature representations and, using the k-mer method, captures the dependencies between sequences and structures. Finally, the relationship between lncRNA and protein is learned through fuzzy decision. The experimental results show that the ACC of RLF-LPI is 0.912 on the ATH948 dataset and 0.921 on the ZEA22133 dataset, indicating that the proposed method predicts lncRNA-protein interactions better than other methods.
- Published
- 2022
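The k-mer method named in the RLF-LPI abstract can be sketched as a frequency vector over all length-k substrings of a sequence. This is a generic illustration and not the RLF-LPI feature pipeline; the function name `kmer_frequencies`, the RNA alphabet ordering, and the example sequence are assumptions.

```python
from collections import Counter
from itertools import product

def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Return the normalised frequency of every possible k-mer, in a fixed order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Slide a window of width k over the sequence and count each substring.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]

# Hypothetical short RNA fragment; 2-mers give a 4**2 = 16-dimensional vector.
features = kmer_frequencies("AUGGCUAUGG", k=2)
print(len(features), features)
```

The same function would apply to protein sequences given a 20-letter (or reduced) amino-acid alphabet, at the cost of a much larger vector for the same k.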
6. Predicting lncRNA-protein interactions with bipartite graph embedding and deep graph neural networks
- Author
-
Yuzhou Ma, Han Zhang, Chen Jin, and Chuanze Kang
- Subjects
lncRNA-protein interaction, graph neural network, bipartite graph embedding, heterogeneous graph, link prediction, Genetics, QH426-470
- Abstract
Background: Long non-coding RNAs (lncRNAs) play crucial roles in numerous biological processes. Investigation of lncRNA-protein interactions contributes to discovering the undetected molecular functions of lncRNAs. In recent years, computational approaches have increasingly replaced the traditional time-consuming experiments used to uncover possible unknown associations. However, the heterogeneity in association prediction between lncRNAs and proteins has not been adequately explored, and it remains challenging to integrate the heterogeneity of lncRNA-protein interactions with graph neural network algorithms. Methods: In this paper, we constructed a deep architecture based on GNNs called BiHo-GNN, which is the first to integrate the properties of homogeneous and heterogeneous networks through bipartite graph embedding. Unlike previous research, BiHo-GNN can capture the mechanism of molecular association through the data encoder of heterogeneous networks. Meanwhile, we design a process of mutual optimization between homogeneous and heterogeneous networks, which can promote the robustness of BiHo-GNN. Results: We collected four datasets for predicting lncRNA-protein interactions and compared the performance of current prediction models on benchmarking datasets. BiHo-GNN outperforms existing bipartite graph-based methods. Conclusion: BiHo-GNN integrates the bipartite graph with homogeneous graph networks. Based on this model structure, lncRNA-protein interactions and potential associations can be predicted and discovered accurately.
- Published
- 2023
7. LPI-EnEDT: an ensemble framework with extra tree and decision tree classifiers for imbalanced lncRNA-protein interaction data classification
- Author
-
Lihong Peng, Ruya Yuan, Ling Shen, Pengfei Gao, and Liqian Zhou
- Subjects
lncRNA-protein interaction, Ensemble, Class imbalance, Computer applications to medicine. Medical informatics, R858-859.7, Analysis, QA299.6-433
- Abstract
Background: Long noncoding RNAs (lncRNAs) have dense linkages with various biological processes. Identifying interacting lncRNA-protein pairs contributes to understanding the functions and mechanisms of lncRNAs. Wet experiments are costly and time-consuming. Most computational methods fail to account for the imbalanced characteristics of lncRNA-protein interaction (LPI) data. More importantly, they were evaluated on a single dataset, which produces prediction bias. Results: In this study, we develop an ensemble framework (LPI-EnEDT) with extra tree and decision tree classifiers to implement imbalanced LPI data classification. First, five LPI datasets are arranged. Second, lncRNAs and proteins are separately characterized based on Pyfeat and BioTriangle and concatenated as a vector to represent each lncRNA-protein pair. Finally, an ensemble framework with extra tree and decision tree classifiers is developed to classify unlabeled lncRNA-protein pairs. The comparative experiments demonstrate that LPI-EnEDT outperforms four classical LPI prediction methods (LPI-BLS, LPI-CatBoost, LPI-SKF, and PLIPCOM) under cross validations on lncRNAs, proteins, and LPIs. The average AUC values on the five datasets are 0.8480, 0.7078, and 0.9066 under the three cross validations, respectively. The average AUPRs are 0.8175, 0.7265, and 0.8882, respectively. Case analyses suggest that there are underlying associations between HOTTIP and Q9Y6M1, and between NRON and Q15717. Conclusions: By fusing diverse biological features of lncRNAs and proteins and exploiting an ensemble learning model with extra tree and decision tree classifiers, this work focuses on imbalanced LPI data classification as well as interaction information inference for a new lncRNA (or protein).
- Published
- 2021
8. LPI-HyADBS: a hybrid framework for lncRNA-protein interaction prediction integrating feature selection and classification
- Author
-
Liqian Zhou, Qi Duan, Xiongfei Tian, He Xu, Jianxin Tang, and Lihong Peng
- Subjects
C-SVM, Deep neural network, Ensemble learning, Feature selection, lncRNA-protein interaction, XGBoost, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
- Abstract
Abstract Background Long noncoding RNAs (lncRNAs) have dense linkages with a plethora of important cellular activities. lncRNAs exert functions by linking with corresponding RNA-binding proteins. Since experimental techniques to detect lncRNA-protein interactions (LPIs) are laborious and time-consuming, a few computational methods have been reported for LPI prediction. However, computation-based LPI identification methods have the following limitations: (1) Most methods were evaluated on a single dataset, and researchers may thus fail to measure their generalization ability. (2) The majority of methods were validated under cross validation on lncRNA-protein pairs, did not investigate the performance under other cross validations, especially for cross validation on independent lncRNAs and independent proteins. (3) lncRNAs and proteins have abundant biological information, how to select informative features need to further investigate. Results Under a hybrid framework (LPI-HyADBS) integrating feature selection based on AdaBoost, and classification models including deep neural network (DNN), extreme gradient Boost (XGBoost), and SVM with a penalty Coefficient of misclassification (C-SVM), this work focuses on finding new LPIs. First, five datasets are arranged. Each dataset contains lncRNA sequences, protein sequences, and an LPI network. Second, biological features of lncRNAs and proteins are acquired based on Pyfeat. Third, the obtained features of lncRNAs and proteins are selected based on AdaBoost and concatenated to depict each LPI sample. Fourth, DNN, XGBoost, and C-SVM are used to classify lncRNA-protein pairs based on the concatenated features. Finally, a hybrid framework is developed to integrate the classification results from the above three classifiers. 
LPI-HyADBS is compared to six classical LPI prediction approaches (LPI-SKF, LPI-NRLMF, Capsule-LPI, LPI-CNNCP, LPLNP, and LPBNI) on five datasets under 5-fold cross validations on lncRNAs, proteins, lncRNA-protein pairs, and independent lncRNAs and independent proteins. The results show LPI-HyADBS has the best LPI prediction performance under four different cross validations. In particular, LPI-HyADBS obtains better classification ability than other six approaches under the constructed independent dataset. Case analyses suggest that there is relevance between ZNF667-AS1 and Q15717. Conclusions Integrating feature selection approach based on AdaBoost, three classification techniques including DNN, XGBoost, and C-SVM, this work develops a hybrid framework to identify new linkages between lncRNAs and proteins.
- Published
- 2021
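The LPI-HyADBS abstract distinguishes cross validation on lncRNA-protein pairs from cross validation on lncRNAs or on proteins. A minimal sketch of the difference: when folds are built by lncRNA, every pair involving a given lncRNA lands in the same fold, so test lncRNAs are never seen in training. The function `group_kfold` and the toy pair list are hypothetical and are not taken from the paper.

```python
import random

def group_kfold(pairs, key, n_splits, seed=0):
    """Yield (train, test) lists; all pairs sharing key(pair) fall in one fold."""
    groups = sorted({key(p) for p in pairs})
    random.Random(seed).shuffle(groups)
    # Assign whole groups (e.g. whole lncRNAs) to folds in round-robin order.
    fold_of = {g: i % n_splits for i, g in enumerate(groups)}
    for fold in range(n_splits):
        test = [p for p in pairs if fold_of[key(p)] == fold]
        train = [p for p in pairs if fold_of[key(p)] != fold]
        yield train, test

# Hypothetical candidate pairs: 6 lncRNAs x 4 proteins.
pairs = [(f"lnc{i}", f"prot{j}") for i in range(6) for j in range(4)]

# Cross validation on lncRNAs: group by the first element of each pair.
lnc_folds = list(group_kfold(pairs, key=lambda p: p[0], n_splits=3))
# Cross validation on proteins: group by the second element.
prot_folds = list(group_kfold(pairs, key=lambda p: p[1], n_splits=2))
```

Cross validation on pairs would instead shuffle and split `pairs` directly, which lets the same lncRNA or protein appear on both sides of a split and is therefore the easier setting.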
9. LPI-deepGBDT: a multiple-layer deep framework based on gradient boosting decision trees for lncRNA–protein interaction identification
- Author
-
Liqian Zhou, Zhao Wang, Xiongfei Tian, and Lihong Peng
- Subjects
lncRNA–protein interaction, Multiple-layer deep architecture, Gradient boosting decision tree, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
- Abstract
Abstract Background Long noncoding RNAs (lncRNAs) play important roles in various biological and pathological processes. Discovery of lncRNA–protein interactions (LPIs) contributes to understand the biological functions and mechanisms of lncRNAs. Although wet experiments find a few interactions between lncRNAs and proteins, experimental techniques are costly and time-consuming. Therefore, computational methods are increasingly exploited to uncover the possible associations. However, existing computational methods have several limitations. First, majority of them were measured based on one simple dataset, which may result in the prediction bias. Second, few of them are applied to identify relevant data for new lncRNAs (or proteins). Finally, they failed to utilize diverse biological information of lncRNAs and proteins. Results Under the feed-forward deep architecture based on gradient boosting decision trees (LPI-deepGBDT), this work focuses on classify unobserved LPIs. First, three human LPI datasets and two plant LPI datasets are arranged. Second, the biological features of lncRNAs and proteins are extracted by Pyfeat and BioProt, respectively. Thirdly, the features are dimensionally reduced and concatenated as a vector to represent an lncRNA–protein pair. Finally, a deep architecture composed of forward mappings and inverse mappings is developed to predict underlying linkages between lncRNAs and proteins. LPI-deepGBDT is compared with five classical LPI prediction models (LPI-BLS, LPI-CatBoost, PLIPCOM, LPI-SKF, and LPI-HNM) under three cross validations on lncRNAs, proteins, lncRNA–protein pairs, respectively. It obtains the best average AUC and AUPR values under the majority of situations, significantly outperforming other five LPI identification methods. That is, AUCs computed by LPI-deepGBDT are 0.8321, 0.6815, and 0.9073, respectively and AUPRs are 0.8095, 0.6771, and 0.8849, respectively. 
The results demonstrate the powerful classification ability of LPI-deepGBDT. Case study analyses show that there may be interactions between GAS5 and Q15717, RAB30-AS1 and O00425, and LINC-01572 and P35637. Conclusions Integrating ensemble learning and hierarchical distributed representations and building a multiple-layered deep architecture, this work improves LPI prediction performance as well as effectively probes interaction data for new lncRNAs/proteins.
- Published
- 2021
10. Multi-feature Fusion Method Based on Linear Neighborhood Propagation Predict Plant LncRNA–Protein Interactions.
- Author
-
Jia, Lijuan and Luan, Yushi
- Subjects
PLANT propagation, PLANT molecular biology, NEIGHBORHOODS, LINCRNA, DISEASE resistance of plants, FUSION reactors
- Abstract
Long non-coding RNAs (lncRNAs) have attracted extensive attention due to their important roles in various biological processes, among which lncRNA–protein interactions play an important regulatory role in plant immunity and life activities. Laboratory methods are time-consuming and labor-intensive, so many computational methods have gradually emerged as auxiliary tools to assist relevant research. However, there are relatively few methods for predicting lncRNA–protein interactions in plants. Because experimentally verified interaction data are scarce, there is an imbalance between known and unknown interaction samples in plant datasets. In this study, a multi-feature fusion method based on linear neighborhood propagation, called MPLPLNP, is developed to predict unobserved plant lncRNA–protein interaction pairs from known interaction pairs. The linear neighborhood similarity of the feature space is calculated, and the results are predicted by label propagation. Meanwhile, multiple feature training is integrated to better explore the potential interaction information in the data. The experimental results show that the proposed multi-feature fusion method can improve the performance of the model and is superior to other state-of-the-art approaches. Moreover, the proposed approach has better performance and generalization ability on various plant datasets, which is expected to facilitate related research in plant molecular biology. [ABSTRACT FROM AUTHOR]
- Published
- 2022
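The label propagation step described in the MPLPLNP abstract can be sketched with the common update F <- alpha * W F + (1 - alpha) * Y. This is a generic illustration, not the authors' implementation: the row-normalised similarity matrix `W` is assumed to be given (computing linear neighborhood similarity is omitted), and the toy values, `alpha`, and the iteration count are hypothetical.

```python
def label_propagation(W, Y, alpha=0.5, n_iter=100):
    """Iterate F <- alpha * W F + (1 - alpha) * Y for a row-normalised W."""
    n, m = len(Y), len(Y[0])
    F = [row[:] for row in Y]  # start from the known interaction labels
    for _ in range(n_iter):
        F = [[alpha * sum(W[i][k] * F[k][j] for k in range(n))
              + (1 - alpha) * Y[i][j]
              for j in range(m)]
             for i in range(n)]
    return F

# Hypothetical toy data: 3 lncRNAs, 2 proteins.
# W[i][k] is the similarity weight of lncRNA k when reconstructing lncRNA i.
W = [[0.0, 0.9, 0.1],
     [0.9, 0.0, 0.1],
     [0.5, 0.5, 0.0]]
# Y[i][j] = 1 if lncRNA i is known to interact with protein j.
Y = [[1, 0],
     [0, 0],   # lncRNA 1 has no known interactions
     [0, 1]]

F = label_propagation(W, Y)
print(F[1])  # propagated scores for the unlabeled lncRNA
```

In this toy case the unlabeled lncRNA is most similar to lncRNA 0, so its propagated score for protein 0 ends up higher than its score for protein 1.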
11. Capsule-LPI: a LncRNA–protein interaction predicting tool based on a capsule network
- Author
-
Ying Li, Hang Sun, Shiyao Feng, Qi Zhang, Siyu Han, and Wei Du
- Subjects
Long noncoding RNA, lncRNA–protein interaction, Capsule network, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
- Abstract
Abstract Background Long noncoding RNAs (lncRNAs) play important roles in multiple biological processes. Identifying LncRNA–protein interactions (LPIs) is key to understanding lncRNA functions. Although some LPIs computational methods have been developed, the LPIs prediction problem remains challenging. How to integrate multimodal features from more perspectives and build deep learning architectures with better recognition performance have always been the focus of research on LPIs. Results We present a novel multichannel capsule network framework to integrate multimodal features for LPI prediction, Capsule-LPI. Capsule-LPI integrates four groups of multimodal features, including sequence features, motif information, physicochemical properties and secondary structure features. Capsule-LPI is composed of four feature-learning subnetworks and one capsule subnetwork. Through comprehensive experimental comparisons and evaluations, we demonstrate that both multimodal features and the architecture of the multichannel capsule network can significantly improve the performance of LPI prediction. The experimental results show that Capsule-LPI performs better than the existing state-of-the-art tools. The precision of Capsule-LPI is 87.3%, which represents a 1.7% improvement. The F-value of Capsule-LPI is 92.2%, which represents a 1.4% improvement. Conclusions This study provides a novel and feasible LPI prediction tool based on the integration of multimodal features and a capsule network. A webserver ( http://csbg-jlu.site/lpc/predict ) is developed to be convenient for users.
- Published
- 2021
12. LPInsider: a webserver for lncRNA–protein interaction extraction from the literature.
- Author
-
Li, Ying, Wei, Lizheng, Wang, Cankun, Zhao, Jianing, Han, Siyu, Zhang, Yu, and Du, Wei
- Subjects
*INTERNET servers, *PARTS of speech, *LINCRNA, *TEXT mining, *LOGISTIC regression analysis, *SOURCE code, *MACHINE learning
- Abstract
Background: Long non-coding RNA (LncRNA) plays important roles in physiological and pathological processes. Identifying LncRNA–protein interactions (LPIs) is essential to understand the molecular mechanism and infer the functions of lncRNAs. With the overwhelming size of the biomedical literature, extracting LPIs directly from the biomedical literature is essential, promising and challenging. However, there is no webserver of LPIs relationship extraction from literature. Results: LPInsider is developed as the first webserver for extracting LPIs from biomedical literature texts based on multiple text features (semantic word vectors, syntactic structure vectors, distance vectors, and part of speech vectors) and logistic regression. LPInsider allows researchers to extract LPIs by uploading PMID, PMCID, PMID List, or biomedical text. A manually filtered and highly reliable LPI corpus is integrated in LPInsider. The performance of LPInsider is optimal by comprehensive experiment on different combinations of different feature and machine learning models. Conclusions: LPInsider is an efficient analytical tool for LPIs that helps researchers to enhance their comprehension of lncRNAs from text mining, and also saving their time. In addition, LPInsider is freely accessible from http://www.csbg-jlu.info/LPInsider/ with no login requirement. The source code and LPIs corpus can be downloaded from https://github.com/qiufengdiewu/LPInsider. [ABSTRACT FROM AUTHOR]
- Published
- 2022
13. EnANNDeep: An Ensemble-based lncRNA–protein Interaction Prediction Framework with Adaptive k-Nearest Neighbor Classifier and Deep Models.
- Author
-
Peng, Lihong, Tan, Jingwei, Tian, Xiongfei, and Zhou, Liqian
- Subjects
K-nearest neighbor classification, ARTIFICIAL intelligence, LINCRNA, FORECASTING
- Abstract
lncRNA–protein interactions (LPIs) prediction can deepen the understanding of many important biological processes. Artificial intelligence methods have reported many possible LPIs. However, most computational techniques were evaluated mainly on one dataset, which may produce prediction bias. More importantly, they were validated only under cross validation on lncRNA–protein pairs, and did not consider the performance under cross validations on lncRNAs and proteins, thus fail to search related proteins/lncRNAs for a new lncRNA/protein. Under an ensemble learning framework (EnANNDeep) composed of adaptive k-nearest neighbor classifier and Deep models, this study focuses on systematically finding underlying linkages between lncRNAs and proteins. First, five LPI-related datasets are arranged. Second, multiple source features are integrated to depict an lncRNA–protein pair. Third, adaptive k-nearest neighbor classifier, deep neural network, and deep forest are designed to score unknown lncRNA–protein pairs, respectively. Finally, interaction probabilities from the three predictors are integrated based on a soft voting technique. In comparing to five classical LPI identification models (SFPEL, PMDKN, CatBoost, PLIPCOM, and LPI-SKF) under fivefold cross validations on lncRNAs, proteins, and LPIs, EnANNDeep computes the best average AUCs of 0.8660, 0.8775, and 0.9166, respectively, and the best average AUPRs of 0.8545, 0.8595, and 0.9054, respectively, indicating its superior LPI prediction ability. Case study analyses indicate that SNHG10 may have dense linkage with Q15717. In the ensemble framework, adaptive k-nearest neighbor classifier can separately pick the most appropriate k for each query lncRNA–protein pair. More importantly, deep models including deep neural network and deep forest can effectively learn the representative features of lncRNAs and proteins. [ABSTRACT FROM AUTHOR]
- Published
- 2022
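The soft voting step in the EnANNDeep abstract, where interaction probabilities from three predictors are combined, can be sketched as a (weighted) average of per-pair probabilities. The scores, the equal default weights, and the 0.5 decision threshold below are hypothetical; the abstract does not state the weights the authors use.

```python
def soft_vote(probabilities, weights=None):
    """Weighted average of per-model interaction probabilities for each pair."""
    weights = weights or [1.0] * len(probabilities)
    total = sum(weights)
    n_pairs = len(probabilities[0])
    return [sum(w * p[i] for w, p in zip(weights, probabilities)) / total
            for i in range(n_pairs)]

# Hypothetical scores from three predictors for three lncRNA-protein pairs.
knn_scores = [0.9, 0.2, 0.6]      # adaptive k-nearest neighbor classifier
dnn_scores = [0.8, 0.4, 0.3]      # deep neural network
forest_scores = [0.7, 0.3, 0.7]   # deep forest

ensemble = soft_vote([knn_scores, dnn_scores, forest_scores])
predicted = [score >= 0.5 for score in ensemble]
print(ensemble, predicted)
```

Averaging probabilities rather than hard votes lets a confident predictor outweigh two lukewarm ones, as happens for the third pair here.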
14. Predicting lncRNA–Protein Interactions by Heterogenous Network Embedding.
- Author
-
Zhao, Guoqing, Li, Pengpai, Qiao, Xu, Han, Xianhua, and Liu, Zhi-Ping
- Subjects
RECEIVER operating characteristic curves, FORECASTING
- Abstract
lncRNA–protein interactions play essential roles in a variety of cellular processes. However, the experimental methods for systematically mapping of lncRNA–protein interactions remain time-consuming and expensive. Therefore, it is urgent to develop reliable computational methods for predicting lncRNA–protein interactions. In this study, we propose a computational method called LncPNet to predict potential lncRNA–protein interactions by embedding an lncRNA–protein heterogenous network. The experimental results indicate that LncPNet achieves promising performance on benchmark datasets extracted from the NPInter database with an accuracy of 0.930 and area under ROC curve (AUC) of 0.971. In addition, we further compare our method with other eight state-of-the-art methods, and the results illustrate that our method achieves superior prediction performance. LncPNet provides an effective method via a new perspective of representing lncRNA–protein heterogenous network, which will greatly benefit the prediction of lncRNA–protein interactions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
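The two metrics reported for LncPNet, accuracy and area under the ROC curve, can be computed from labels and scores as sketched below. The AUC here uses its rank interpretation (the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair). The labels, scores, and the 0.5 threshold are made-up illustration values, not results from the paper.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Fraction of pairs whose thresholded score matches the label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return hits / len(labels)

# Hypothetical predictions for six lncRNA-protein pairs (1 = interacting).
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1]

auc = roc_auc(labels, scores)
acc = accuracy(labels, scores)
print(auc, acc)
```

Accuracy depends on the chosen threshold, whereas AUC depends only on how the scores rank positives against negatives, which is one reason abstracts in this area often report both.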
15. LPI-EnEDT: an ensemble framework with extra tree and decision tree classifiers for imbalanced lncRNA-protein interaction data classification.
- Author
-
Peng, Lihong, Yuan, Ruya, Shen, Ling, Gao, Pengfei, and Zhou, Liqian
- Subjects
*DECISION trees, *LINCRNA
- Abstract
Background: Long noncoding RNAs (lncRNAs) have dense linkages with various biological processes. Identifying interacting lncRNA-protein pairs contributes to understanding the functions and mechanisms of lncRNAs. Wet experiments are costly and time-consuming. Most computational methods fail to account for the imbalanced characteristics of lncRNA-protein interaction (LPI) data. More importantly, they were evaluated on a single dataset, which produces prediction bias. Results: In this study, we develop an ensemble framework (LPI-EnEDT) with extra tree and decision tree classifiers to implement imbalanced LPI data classification. First, five LPI datasets are arranged. Second, lncRNAs and proteins are separately characterized based on Pyfeat and BioTriangle and concatenated as a vector to represent each lncRNA-protein pair. Finally, an ensemble framework with extra tree and decision tree classifiers is developed to classify unlabeled lncRNA-protein pairs. The comparative experiments demonstrate that LPI-EnEDT outperforms four classical LPI prediction methods (LPI-BLS, LPI-CatBoost, LPI-SKF, and PLIPCOM) under cross validations on lncRNAs, proteins, and LPIs. The average AUC values on the five datasets are 0.8480, 0.7078, and 0.9066 under the three cross validations, respectively. The average AUPRs are 0.8175, 0.7265, and 0.8882, respectively. Case analyses suggest that there are underlying associations between HOTTIP and Q9Y6M1, and between NRON and Q15717. Conclusions: By fusing diverse biological features of lncRNAs and proteins and exploiting an ensemble learning model with extra tree and decision tree classifiers, this work focuses on imbalanced LPI data classification as well as interaction information inference for a new lncRNA (or protein). [ABSTRACT FROM AUTHOR]
- Published
- 2021
16. Predicting lncRNA–Protein Interactions by Heterogenous Network Embedding
- Author
-
Guoqing Zhao, Pengpai Li, Xu Qiao, Xianhua Han, and Zhi-Ping Liu
- Subjects
lncRNA–protein interaction, computational method, heterogenous network, network embedding, LncPNet, Genetics, QH426-470
- Abstract
lncRNA–protein interactions play essential roles in a variety of cellular processes. However, the experimental methods for systematically mapping of lncRNA–protein interactions remain time-consuming and expensive. Therefore, it is urgent to develop reliable computational methods for predicting lncRNA–protein interactions. In this study, we propose a computational method called LncPNet to predict potential lncRNA–protein interactions by embedding an lncRNA–protein heterogenous network. The experimental results indicate that LncPNet achieves promising performance on benchmark datasets extracted from the NPInter database with an accuracy of 0.930 and area under ROC curve (AUC) of 0.971. In addition, we further compare our method with other eight state-of-the-art methods, and the results illustrate that our method achieves superior prediction performance. LncPNet provides an effective method via a new perspective of representing lncRNA–protein heterogenous network, which will greatly benefit the prediction of lncRNA–protein interactions.
- Published
- 2022
17. LPI-HyADBS: a hybrid framework for lncRNA-protein interaction prediction integrating feature selection and classification.
- Author
-
Zhou, Liqian, Duan, Qi, Tian, Xiongfei, Xu, He, Tang, Jianxin, and Peng, Lihong
- Subjects
*LINCRNA, *FEATURE selection, *RNA-binding proteins, *BASE pairs, *AMINO acid sequence
- Abstract
Background: Long noncoding RNAs (lncRNAs) have dense linkages with a plethora of important cellular activities. lncRNAs exert functions by linking with corresponding RNA-binding proteins. Since experimental techniques to detect lncRNA-protein interactions (LPIs) are laborious and time-consuming, a few computational methods have been reported for LPI prediction. However, computation-based LPI identification methods have the following limitations: (1) Most methods were evaluated on a single dataset, and researchers may thus fail to measure their generalization ability. (2) The majority of methods were validated under cross validation on lncRNA-protein pairs, did not investigate the performance under other cross validations, especially for cross validation on independent lncRNAs and independent proteins. (3) lncRNAs and proteins have abundant biological information, how to select informative features need to further investigate. Results: Under a hybrid framework (LPI-HyADBS) integrating feature selection based on AdaBoost, and classification models including deep neural network (DNN), extreme gradient Boost (XGBoost), and SVM with a penalty Coefficient of misclassification (C-SVM), this work focuses on finding new LPIs. First, five datasets are arranged. Each dataset contains lncRNA sequences, protein sequences, and an LPI network. Second, biological features of lncRNAs and proteins are acquired based on Pyfeat. Third, the obtained features of lncRNAs and proteins are selected based on AdaBoost and concatenated to depict each LPI sample. Fourth, DNN, XGBoost, and C-SVM are used to classify lncRNA-protein pairs based on the concatenated features. Finally, a hybrid framework is developed to integrate the classification results from the above three classifiers. 
LPI-HyADBS is compared to six classical LPI prediction approaches (LPI-SKF, LPI-NRLMF, Capsule-LPI, LPI-CNNCP, LPLNP, and LPBNI) on five datasets under 5-fold cross validations on lncRNAs, proteins, lncRNA-protein pairs, and independent lncRNAs and independent proteins. The results show LPI-HyADBS has the best LPI prediction performance under four different cross validations. In particular, LPI-HyADBS obtains better classification ability than other six approaches under the constructed independent dataset. Case analyses suggest that there is relevance between ZNF667-AS1 and Q15717. Conclusions: Integrating feature selection approach based on AdaBoost, three classification techniques including DNN, XGBoost, and C-SVM, this work develops a hybrid framework to identify new linkages between lncRNAs and proteins. [ABSTRACT FROM AUTHOR]
- Published
- 2021
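The distinction the abstract above draws between cross validation on lncRNA-protein pairs and cross validation on lncRNAs (or proteins) can be illustrated with a minimal sketch. The function names and the toy identifiers are hypothetical and are not taken from LPI-HyADBS; the sketch only shows how the two kinds of folds differ.

```python
import random

def cv_folds_on_pairs(pairs, k=5, seed=0):
    """Shuffle individual lncRNA-protein pairs and deal them into k folds.

    The same lncRNA (or protein) can appear in both training and test folds.
    """
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def cv_folds_on_entities(pairs, index, k=5, seed=0):
    """Assign whole entities to folds (index 0 = lncRNA, index 1 = protein).

    Every pair involving a given entity lands in the same fold, so a test
    fold only contains entities that were never seen during training.
    """
    rng = random.Random(seed)
    entities = sorted({p[index] for p in pairs})
    rng.shuffle(entities)
    fold_of = {e: i % k for i, e in enumerate(entities)}
    folds = [[] for _ in range(k)]
    for p in pairs:
        folds[fold_of[p[index]]].append(p)
    return folds

# Toy data: 10 hypothetical lncRNAs crossed with 4 hypothetical proteins.
toy_pairs = [(f"lnc{i}", f"prot{j}") for i in range(10) for j in range(4)]
pair_folds = cv_folds_on_pairs(toy_pairs)
lnc_folds = cv_folds_on_entities(toy_pairs, index=0)
```

Splitting by entity is the harder setting, because the model has to score interactions for an lncRNA or protein with no known partners in the training folds.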
18. LPI-deepGBDT: a multiple-layer deep framework based on gradient boosting decision trees for lncRNA–protein interaction identification.
- Author
-
Zhou, Liqian, Wang, Zhao, Tian, Xiongfei, and Peng, Lihong
- Subjects
*LINCRNA , *DECISION trees , *BOOSTING algorithms , *PREDICTION models - Abstract
Background: Long noncoding RNAs (lncRNAs) play important roles in various biological and pathological processes. Discovery of lncRNA–protein interactions (LPIs) contributes to understanding the biological functions and mechanisms of lncRNAs. Although wet experiments find a few interactions between lncRNAs and proteins, experimental techniques are costly and time-consuming. Therefore, computational methods are increasingly exploited to uncover the possible associations. However, existing computational methods have several limitations. First, the majority of them were evaluated on a single dataset, which may result in prediction bias. Second, few of them are applied to identify relevant data for new lncRNAs (or proteins). Finally, they failed to utilize diverse biological information of lncRNAs and proteins. Results: Under the feed-forward deep architecture based on gradient boosting decision trees (LPI-deepGBDT), this work focuses on classifying unobserved LPIs. First, three human LPI datasets and two plant LPI datasets are arranged. Second, the biological features of lncRNAs and proteins are extracted by Pyfeat and BioProt, respectively. Third, the features are dimensionally reduced and concatenated as a vector to represent an lncRNA–protein pair. Finally, a deep architecture composed of forward mappings and inverse mappings is developed to predict underlying linkages between lncRNAs and proteins. LPI-deepGBDT is compared with five classical LPI prediction models (LPI-BLS, LPI-CatBoost, PLIPCOM, LPI-SKF, and LPI-HNM) under three cross validations on lncRNAs, proteins, and lncRNA–protein pairs, respectively. It obtains the best average AUC and AUPR values under the majority of situations, significantly outperforming the other five LPI identification methods. That is, AUCs computed by LPI-deepGBDT are 0.8321, 0.6815, and 0.9073, respectively, and AUPRs are 0.8095, 0.6771, and 0.8849, respectively. The results demonstrate the powerful classification ability of LPI-deepGBDT.
Case study analyses show that there may be interactions between GAS5 and Q15717, RAB30-AS1 and O00425, and LINC-01572 and P35637. Conclusions: Integrating ensemble learning and hierarchical distributed representations and building a multiple-layered deep architecture, this work improves LPI prediction performance as well as effectively probes interaction data for new lncRNAs/proteins. [ABSTRACT FROM AUTHOR]
- Published
- 2021
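Several records in this list, including the one above, report AUC and AUPR values. As a reminder of what those two numbers measure, here is a minimal sketch in plain Python. The function names are hypothetical, average precision is used as a common step-wise estimate of AUPR, and none of this is taken from the evaluation code of any listed paper.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive example.

    Assumes at least one positive label; tied scores are ranked in sort order.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Toy predictions for four candidate pairs (1 = known interaction).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
auc = roc_auc(toy_scores, toy_labels)            # 3 of 4 positive/negative pairs ranked correctly
ap = average_precision(toy_scores, toy_labels)   # (1/1 + 2/3) / 2
```

Because known interactions are usually far rarer than unlabeled pairs, AUPR tends to be the more demanding of the two measures, which is consistent with the lower AUPR values reported across these abstracts.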
19. Predicting lncRNA–Protein Interaction With Weighted Graph-Regularized Matrix Factorization
- Author
-
Xibo Sun, Leiming Cheng, Jinyang Liu, Cuinan Xie, Jiasheng Yang, and Fu Li
- Subjects
lncRNA–protein interaction ,weighted graph-regularized matrix factorization ,lncRNA similarity ,protein similarity ,SFPQ ,SNHG3 ,Genetics ,QH426-470 - Abstract
Long non-coding RNAs (lncRNAs) have attracted wide attention because of their close associations with many key biological activities. Though the precise functions of most lncRNAs are unknown, research shows that lncRNAs usually exert biological function by interacting with the corresponding proteins. The experimental validation of interactions between lncRNAs and proteins is costly and time-consuming. In this study, we developed a weighted graph-regularized matrix factorization (LPI-WGRMF) method to find unobserved lncRNA–protein interactions (LPIs) based on lncRNA similarity matrix, protein similarity matrix, and known LPIs. We compared our proposed LPI-WGRMF method with five classical LPI prediction methods, that is, LPBNI, LPI-IBNRA, LPIHN, RWR, and collaborative filtering (CF). The results demonstrate that the LPI-WGRMF method can produce high-accuracy performance, obtaining an AUC score of 0.9012 and AUPR of 0.7324. The case study showed that SFPQ, SNHG3, and PRPF31 may associate with Q9NUL5, Q9NUL5, and Q9UKV8 with the highest linking probabilities and need further experimental validation.
- Published
- 2021
20. Predicting lncRNA–Protein Interaction With Weighted Graph-Regularized Matrix Factorization.
- Author
-
Sun, Xibo, Cheng, Leiming, Liu, Jinyang, Xie, Cuinan, Yang, Jiasheng, and Li, Fu
- Subjects
MATRIX decomposition ,LINCRNA ,EXTRACELLULAR matrix proteins - Abstract
Long non-coding RNAs (lncRNAs) have attracted wide attention because of their close associations with many key biological activities. Though the precise functions of most lncRNAs are unknown, research shows that lncRNAs usually exert biological function by interacting with the corresponding proteins. The experimental validation of interactions between lncRNAs and proteins is costly and time-consuming. In this study, we developed a weighted graph-regularized matrix factorization (LPI-WGRMF) method to find unobserved lncRNA–protein interactions (LPIs) based on lncRNA similarity matrix, protein similarity matrix, and known LPIs. We compared our proposed LPI-WGRMF method with five classical LPI prediction methods, that is, LPBNI, LPI-IBNRA, LPIHN, RWR, and collaborative filtering (CF). The results demonstrate that the LPI-WGRMF method can produce high-accuracy performance, obtaining an AUC score of 0.9012 and AUPR of 0.7324. The case study showed that SFPQ, SNHG3, and PRPF31 may associate with Q9NUL5, Q9NUL5, and Q9UKV8 with the highest linking probabilities and need further experimental validation. [ABSTRACT FROM AUTHOR]
- Published
- 2021
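For orientation, a weighted graph-regularized matrix factorization of the kind named in the record above is commonly written as the objective below. This is a generic textbook-style form, not the formula from the LPI-WGRMF paper, and every symbol is an assumption introduced here: $Y$ is the known lncRNA–protein interaction matrix, $W$ a weight matrix over its entries, $U$ and $V$ the latent factors of lncRNAs and proteins, $L_l$ and $L_p$ graph Laplacians built from the lncRNA and protein similarity matrices, and $\lambda$, $\alpha$, $\beta$ regularization weights.

```latex
\min_{U,\,V}\;
  \bigl\lVert W \odot \bigl(Y - U V^{\top}\bigr) \bigr\rVert_F^{2}
  + \lambda \bigl( \lVert U \rVert_F^{2} + \lVert V \rVert_F^{2} \bigr)
  + \alpha \,\operatorname{tr}\!\bigl(U^{\top} L_l\, U\bigr)
  + \beta \,\operatorname{tr}\!\bigl(V^{\top} L_p\, V\bigr)
```

The two trace terms pull the latent vectors of similar lncRNAs (or similar proteins) toward each other, and the predicted interaction scores are read from the entries of $U V^{\top}$.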
21. Prediction of plant LncRNA-protein interactions based on feature fusion and an improved residual network.
- Author
-
Zhang, Lina, Yang, Runtao, Xia, Defei, Lin, Xiaorui, and Xiong, Wanying
- Subjects
*CHROMOSOME replication , *CELLULAR signal transduction , *LINCRNA , *CHROMOSOME structure , *DEEP learning , *FUSION reactors , *PROTEIN-protein interactions - Abstract
LncRNA(long non-coding RNA)-protein interaction (LPI) has effects on chromosome structure and gene transcription, participating in key cellular processes such as signal transduction, chromosome replication, material transport, and mitosis. Accurate identification of LPIs will provide scientific basis for understanding the molecular mechanisms of LncRNA-related diseases, thereby promoting the progress of disease diagnosis technologies and the development of therapeutic procedures to a certain extent. How to comprehensively mine feature information reflecting functional attributes from LncRNAs and proteins, apply deep learning to extract advanced features from the original input features of proteins and LncRNAs, and effectively fuse them have been important research challenges in this field. Aiming at the limitations of existing methods, based on deep learning techniques such as bi-directional long short-term memory (BiLSTM), attention mechanism, and an improved residual network, an LPI prediction model called LPI-LSTM-ResNet is constructed in this paper. Firstly, the sequence and structural information of LncRNAs and proteins are extracted from different perspectives. Then, the deep interaction fused features between LncRNAs and proteins are obtained by BiLSTM and attention mechanism. Finally, the fused features are input into an improved residual network with LSTM as the residual element to preserve long-distance dependencies between sequences. The 5-fold cross-validation results indicate that the feature combination strategy, feature fusion strategy, and the improved residual network consistently improve the LPI prediction performance. Compared with existing methods on the same plant datasets, LPI-LSTM-ResNet exhibits superior performance.
• Sequence and structural information of LncRNAs and proteins are extracted.
• Deep interaction fused features are obtained by BiLSTM and attention mechanism.
• An improved residual network is proposed to preserve long-distance dependencies.
• Compared with existing methods, LPI-LSTM-ResNet exhibits superior performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
22. LPI-KTASLP: Prediction of LncRNA-Protein Interaction by Semi-Supervised Link Learning With Multivariate Information
- Author
-
Cong Shen, Yijie Ding, Jijun Tang, Limin Jiang, and Fei Guo
- Subjects
LncRNA-protein interaction ,kernel target alignment ,low-rank approximation ,multiple kernel learning ,semi-supervised link prediction ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971 - Abstract
Long non-coding RNA, also known as lncRNA, is a series of single-stranded polynucleotides (no less than 200 nucleotides each), consisting of non-protein coding transcripts. LncRNA plays a crucial role in regulating gene expression, during the transcriptional, post-transcriptional, and epigenetic processes. This is achieved through lncRNA interactions with the corresponding RNA-binding proteins. Computational intelligence has drawn considerable attention in lncRNA-protein interaction (LPI) identification because it reduces excessive laboratory cost and improves speed and accuracy. Although numerous pertinent in silico studies of LPI prediction have been proposed, there is still room for enhancing the accuracy of the existing LPI prediction methods. In this paper, we have proposed a novel method for identifying LPI with kernel target alignment based on semi-supervised link prediction (LPI-KTASLP), which adopts multivariate information to predict lncRNA-protein interactions. To integrate the heterogeneous kernels, kernel target alignment has been applied to deal with kernel fusion. We have calculated the low-rank approximation matrices of lncRNA and protein, where eigendecomposition is used to reduce computing pressure. The prediction model has been obtained by producing the ultimate LPI prediction matrix. Experimental results show that the prediction ability of the LPI-KTASLP algorithm has surpassed many other LPI prediction schemes. Our method of lncRNA-protein interaction prediction has been evaluated on a standard benchmark dataset of LPIs. We have observed that the highest AUPR of 0.6148 is obtained by our proposed model (LPI-KTASLP). This is superior to the integrated LPLNP (AUPR: 0.4584), the RWR (AUPR: 0.2827), the CF (AUPR: 0.2357), the LPIHN (AUPR: 0.2299), and the LPBNI (AUPR: 0.3302). It is very encouraging that most of the LPI predictions have been confirmed to be close to relevant concentrations.
- Published
- 2019
23. ACCBN: ant-Colony-clustering-based bipartite network method for predicting long non-coding RNA–protein interactions
- Author
-
Rong Zhu, Guangshun Li, Jin-Xing Liu, Ling-Yun Dai, and Ying Guo
- Subjects
LncRNA–protein interaction ,Ant colony clustering ,Bipartite network ,Predicting ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5 - Abstract
Background: Long non-coding RNA (lncRNA) studies play an important role in the development, invasion, and metastasis of the tumor. The analysis and screening of the differential expression of lncRNAs in cancer and corresponding paracancerous tissues provides new clues for finding new cancer diagnostic indicators and improving the treatment. Predicting lncRNA–protein interactions is very important in the analysis of lncRNAs. This article proposes an Ant-Colony-Clustering-Based Bipartite Network (ACCBN) method and predicts lncRNA–protein interactions. The ACCBN method combines ant colony clustering and bipartite network inference to predict lncRNA–protein interactions. Results: A five-fold cross-validation method was used in the experimental test. The results show that the values of the evaluation indicators of ACCBN on the test set are significantly better after comparing the predictive ability of ACCBN with RWR, ProCF, LPIHN, and LPBNI method. Conclusions: With the continuous development of biology, besides the research on the cellular process, the research on the interaction function between proteins becomes a new key topic of biology. The studies on protein-protein interactions had important implications for bioinformatics, clinical medicine, and pharmacology. However, there are many kinds of proteins, and their functions of interactions are complicated. Moreover, the experimental methods require time to be confirmed because it is difficult to estimate. Therefore, a viable solution is to predict protein-protein interactions efficiently with computers. The ACCBN method has a good effect on the prediction of protein-protein interactions in terms of sensitivity, precision, accuracy, and F1-score.
- Published
- 2019
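The record above mentions bipartite network inference. A minimal sketch of the generic two-step resource-allocation form of that idea follows. It is an illustration under stated assumptions: the function name and toy identifiers are hypothetical, the ant colony clustering step is omitted, and the abstract does not say whether ACCBN uses exactly this variant.

```python
def bipartite_resource_scores(interactions, lncrnas, proteins):
    """Score every lncRNA-protein pair by two-step resource allocation.

    interactions: set of known (lncRNA, protein) pairs.
    Returns scores[lncRNA][protein]; higher means a more likely interaction.
    """
    prot_of = {l: {p for (x, p) in interactions if x == l} for l in lncrnas}
    lnc_of = {p: {l for (l, x) in interactions if x == p} for p in proteins}
    scores = {}
    for l in lncrnas:
        # Each protein known to bind l starts with one unit of resource.
        # Step 1: every such protein splits its unit equally among the
        # lncRNAs it is known to interact with.
        lnc_res = {m: 0.0 for m in lncrnas}
        for p in prot_of[l]:
            share = 1.0 / len(lnc_of[p])
            for m in lnc_of[p]:
                lnc_res[m] += share
        # Step 2: every lncRNA passes what it received back to its proteins,
        # again split equally.
        prot_res = {p: 0.0 for p in proteins}
        for m, res in lnc_res.items():
            if res and prot_of[m]:
                share = res / len(prot_of[m])
                for p in prot_of[m]:
                    prot_res[p] += share
        scores[l] = prot_res
    return scores

# Toy network: l1 binds p1 and p2; l2 binds p2 and p3.
known = {("l1", "p1"), ("l1", "p2"), ("l2", "p2"), ("l2", "p3")}
toy_scores = bipartite_resource_scores(known, ["l1", "l2"], ["p1", "p2", "p3"])
```

In the toy network, `l1` has no recorded interaction with `p3`, yet `p3` receives a positive score for `l1` because `l1` and `l2` share the partner `p2`. Unobserved pairs are then ranked by these scores.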
24. Capsule-LPI: a LncRNA–protein interaction predicting tool based on a capsule network.
- Author
-
Li, Ying, Sun, Hang, Feng, Shiyao, Zhang, Qi, Han, Siyu, and Du, Wei
- Subjects
*LINCRNA , *MULTIMODAL user interfaces , *DEEP learning , *OSCILLATOR strengths - Abstract
Background: Long noncoding RNAs (lncRNAs) play important roles in multiple biological processes. Identifying LncRNA–protein interactions (LPIs) is key to understanding lncRNA functions. Although some LPIs computational methods have been developed, the LPIs prediction problem remains challenging. How to integrate multimodal features from more perspectives and build deep learning architectures with better recognition performance have always been the focus of research on LPIs. Results: We present a novel multichannel capsule network framework to integrate multimodal features for LPI prediction, Capsule-LPI. Capsule-LPI integrates four groups of multimodal features, including sequence features, motif information, physicochemical properties and secondary structure features. Capsule-LPI is composed of four feature-learning subnetworks and one capsule subnetwork. Through comprehensive experimental comparisons and evaluations, we demonstrate that both multimodal features and the architecture of the multichannel capsule network can significantly improve the performance of LPI prediction. The experimental results show that Capsule-LPI performs better than the existing state-of-the-art tools. The precision of Capsule-LPI is 87.3%, which represents a 1.7% improvement. The F-value of Capsule-LPI is 92.2%, which represents a 1.4% improvement. Conclusions: This study provides a novel and feasible LPI prediction tool based on the integration of multimodal features and a capsule network. A webserver (http://csbg-jlu.site/lpc/predict) is developed to be convenient for users. [ABSTRACT FROM AUTHOR]
- Published
- 2021
25. Probing lncRNA–Protein Interactions: Data Repositories, Models, and Algorithms
- Author
-
Lihong Peng, Fuxing Liu, Jialiang Yang, Xiaojun Liu, Yajie Meng, Xiaojun Deng, Cheng Peng, Geng Tian, and Liqian Zhou
- Subjects
lncRNA–protein interaction ,computational method ,network-based method ,machine learning-based method ,data repositories ,Genetics ,QH426-470 - Abstract
Identifying lncRNA–protein interactions (LPIs) is vital to understanding various key biological processes. Wet experiments found a few LPIs, but experimental methods are costly and time-consuming. Therefore, computational methods are increasingly exploited to capture LPI candidates. We introduced relevant data repositories, focused on two types of LPI prediction models: network-based methods and machine learning-based methods. Machine learning-based methods contain matrix factorization-based techniques and ensemble learning-based techniques. To detect the performance of computational methods, we compared parts of LPI prediction models on Leave-One-Out cross-validation (LOOCV) and fivefold cross-validation. The results show that SFPEL-LPI obtained the best performance of AUC. Although computational models have efficiently unraveled some LPI candidates, there are many limitations involved. We discussed future directions to further boost LPI predictive performance.
- Published
- 2020
26. Predicting lncRNA–Protein Interactions With miRNAs as Mediators in a Heterogeneous Network Model
- Author
-
Yuan-Ke Zhou, Zi-Ang Shen, Han Yu, Tao Luo, Yang Gao, and Pu-Feng Du
- Subjects
heterogeneous network ,lncRNA–protein interaction ,lncRNA–miRNA interaction ,miRNA–protein interaction ,network similarity ,Genetics ,QH426-470 - Abstract
Long non-coding RNAs (lncRNAs) play important roles in various biological processes, where lncRNA–protein interactions are usually involved. Therefore, identifying lncRNA–protein interactions is of great significance to understand the molecular functions of lncRNAs. Since the experiments to identify lncRNA–protein interactions are always costly and time consuming, computational methods are developed as alternative approaches. However, existing lncRNA–protein interaction predictors usually require prior knowledge of lncRNA–protein interactions with experimental evidences. Their performances are limited due to the number of known lncRNA–protein interactions. In this paper, we explored a novel way to predict lncRNA–protein interactions without direct prior knowledge. MiRNAs were picked up as mediators to estimate potential interactions between lncRNAs and proteins. By validating our results based on known lncRNA–protein interactions, our method achieved an AUROC (Area Under Receiver Operating Curve) of 0.821, which is comparable to the state-of-the-art methods. Moreover, our method achieved an improved AUROC of 0.852 by further expanding the training dataset. We believe that our method can be a useful supplement to the existing methods, as it provides an alternative way to estimate lncRNA–protein interactions in a heterogeneous network without direct prior knowledge. All data and codes of this work can be downloaded from GitHub (https://github.com/zyk2118216069/LncRNA-protein-interactions-prediction).
- Published
- 2020
27. Multi-feature fusion for deep learning to predict plant lncRNA-protein interaction.
- Author
-
Wekesa, Jael Sanyanda, Meng, Jun, and Luan, Yushi
- Subjects
*DEEP learning , *RNA-binding proteins , *LINCRNA , *USEFUL plants , *CORN , *NON-coding RNA , *FUSION reactors - Abstract
Long non-coding RNAs (lncRNAs) play key roles in regulating cellular biological processes through diverse molecular mechanisms including binding to RNA binding proteins. The majority of plant lncRNAs are functionally uncharacterized, thus, accurate prediction of plant lncRNA–protein interaction is imperative for subsequent functional studies. We present an integrative model, namely DRPLPI. Its uniqueness is that it predicts by multi-feature fusion. Structural and four groups of sequence features are used, including tri-nucleotide composition, gapped k-mer, recursive complement and binary profile. We design a multi-head self-attention long short-term memory encoder-decoder network to extract generative high-level features. To obtain robust results, DRPLPI combines categorical boosting and extra trees into a single meta-learner. Experiments on Zea mays and Arabidopsis thaliana obtained 0.9820 and 0.9652 area under precision/recall curve (AUPRC) respectively. The proposed method shows significant enhancement in the prediction performance compared with existing state-of-the-art methods.
• We propose a computational predictor called DRPLPI for plant lncRNA–protein interaction prediction.
• DRPLPI integrates highly informative multidimensional representation of the biological properties of the corresponding lncRNAs and proteins.
• The method enhances predictive power of lncRNA–protein interaction and is useful for plant transcriptomic research. [ABSTRACT FROM AUTHOR]
- Published
- 2020
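Tri-nucleotide composition, one of the sequence features named in the record above, can be sketched as a 64-dimensional frequency vector. The function name, the RNA alphabet order, and the handling of ambiguous bases are assumptions made here for illustration. This is not DRPLPI's feature extraction code.

```python
from itertools import product

def kmer_composition(seq, k=3, alphabet="ACGU"):
    """Return the frequency of every k-mer over the alphabet, in lexicographic order.

    DNA-style input is accepted by mapping T to U; windows containing
    characters outside the alphabet (for example N) are skipped.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]

# "AUGAUG" contains the windows AUG, UGA, GAU, AUG.
toy_vector = kmer_composition("AUGAUG")
```

With `k=3` the vector has 4^3 = 64 entries, and fixed-length vectors of this kind are what allow sequences of different lengths to be fed to a single classifier.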
28. Probing lncRNA–Protein Interactions: Data Repositories, Models, and Algorithms.
- Author
-
Peng, Lihong, Liu, Fuxing, Yang, Jialiang, Liu, Xiaojun, Meng, Yajie, Deng, Xiaojun, Peng, Cheng, Tian, Geng, and Zhou, Liqian
- Subjects
INSTITUTIONAL repositories ,PREDICTION models ,SUPPORT vector machines ,ALGORITHMS ,FACTORIZATION - Abstract
Identifying lncRNA–protein interactions (LPIs) is vital to understanding various key biological processes. Wet experiments found a few LPIs, but experimental methods are costly and time-consuming. Therefore, computational methods are increasingly exploited to capture LPI candidates. We introduced relevant data repositories, focused on two types of LPI prediction models: network-based methods and machine learning-based methods. Machine learning-based methods contain matrix factorization-based techniques and ensemble learning-based techniques. To detect the performance of computational methods, we compared parts of LPI prediction models on Leave-One-Out cross-validation (LOOCV) and fivefold cross-validation. The results show that SFPEL-LPI obtained the best performance of AUC. Although computational models have efficiently unraveled some LPI candidates, there are many limitations involved. We discussed future directions to further boost LPI predictive performance. [ABSTRACT FROM AUTHOR]
- Published
- 2020
29. Predicting lncRNA–Protein Interactions With miRNAs as Mediators in a Heterogeneous Network Model.
- Author
-
Zhou, Yuan-Ke, Shen, Zi-Ang, Yu, Han, Luo, Tao, Gao, Yang, and Du, Pu-Feng
- Subjects
MICRORNA ,NON-coding RNA ,PRIOR learning - Abstract
Long non-coding RNAs (lncRNAs) play important roles in various biological processes, where lncRNA–protein interactions are usually involved. Therefore, identifying lncRNA–protein interactions is of great significance to understand the molecular functions of lncRNAs. Since the experiments to identify lncRNA–protein interactions are always costly and time consuming, computational methods are developed as alternative approaches. However, existing lncRNA–protein interaction predictors usually require prior knowledge of lncRNA–protein interactions with experimental evidences. Their performances are limited due to the number of known lncRNA–protein interactions. In this paper, we explored a novel way to predict lncRNA–protein interactions without direct prior knowledge. MiRNAs were picked up as mediators to estimate potential interactions between lncRNAs and proteins. By validating our results based on known lncRNA–protein interactions, our method achieved an AUROC (Area Under Receiver Operating Curve) of 0.821, which is comparable to the state-of-the-art methods. Moreover, our method achieved an improved AUROC of 0.852 by further expanding the training dataset. We believe that our method can be a useful supplement to the existing methods, as it provides an alternative way to estimate lncRNA–protein interactions in a heterogeneous network without direct prior knowledge. All data and codes of this work can be downloaded from GitHub (https://github.com/zyk2118216069/LncRNA-protein-interactions-prediction). [ABSTRACT FROM AUTHOR]
- Published
- 2020
30. LPGNMF: Predicting Long Non-Coding RNA and Protein Interaction Using Graph Regularized Nonnegative Matrix Factorization.
- Author
-
Zhang, Tianyi, Wang, Minghui, Xi, Jianing, and Li, Ao
- Abstract
Long non-coding RNAs (lncRNAs) play crucial roles in a variety of biological processes and complex diseases. Numerous studies have indicated that lncRNAs interact with related proteins to regulate cellular biological processes. Because it is time-consuming and expensive to determine lncRNA-protein interactions by experiment, more accurate predictions of interactions by computational methods are imperative. We propose a novel computational approach, predicting lncRNA-protein interaction using graph regularized nonnegative matrix factorization (LPGNMF), to discover unobserved lncRNA-protein associations. First, we calculate lncRNA similarity and protein similarity by integrating the lncRNA expression information and gene ontology information. Subsequently, we utilize a graph regularized nonnegative matrix factorization framework to predict potential interactions for all lncRNAs simultaneously. In the cross validation test, LPGNMF achieves an AUC of 85.2 percent, higher than those of other compared methods. In addition, novel lncRNA-protein interactions detected by LPGNMF are validated against the literature or databases. The results indicate that our method is effective in discovering potential lncRNA-protein interactions. [ABSTRACT FROM AUTHOR]
- Published
- 2020
31. LPI-BLS: Predicting lncRNA–protein interactions with a broad learning system-based stacked ensemble classifier.
- Author
-
Fan, Xiao-Nan and Zhang, Shao-Wu
- Subjects
*RNA-binding proteins , *NON-coding RNA , *SOURCE code , *INSTRUCTIONAL systems , *DEEP learning , *DNA-binding proteins - Abstract
Many experimental results show that long non-coding RNAs (lncRNAs) play crucial roles in many biological processes, implementing their functions through interaction with RNA-binding proteins (RBPs). Considering that the experimental identification of lncRNA–protein interactions is expensive and time-consuming, many computational methods are proposed to uncover the potential lncRNA–protein interactions. In this study, we develop a novel computational method (namely LPI-BLS) to predict the lncRNA–protein interactions by using the broad learning system and building a stacked ensemble classifier with a logistic regression model. LPI-BLS first adopts the broad learning system to predict the lncRNA–protein interactions. The broad learning system is an alternative to learning in deep structures: a flat network with few parameters. Then, the results of multiple individual broad learning systems are fed into the stacked ensemble classifier built with logistic regression to further improve the predictive performance. Compared with other state-of-the-art methods in the 5-fold cross-validation test, LPI-BLS has the best performance, with an accuracy of 0.902 on the RPI488 dataset and an average accuracy of 0.927 on the RPI7317 dataset. The results in the independent test also show that our LPI-BLS can effectively predict the lncRNA–protein interactions. The source code can be freely downloaded from https://github.com/NWPU-903PR/LPI_BLS. [ABSTRACT FROM AUTHOR]
- Published
- 2019
32. Fusing multiple protein-protein similarity networks to effectively predict lncRNA-protein interactions
- Author
-
Xiaoxiong Zheng, Yang Wang, Kai Tian, Jiaogen Zhou, Jihong Guan, Libo Luo, and Shuigeng Zhou
- Subjects
lncRNA-Protein Interaction ,Random walk ,Similarity network fusion ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5 - Abstract
Background: Long non-coding RNA (lncRNA) plays important roles in many biological and pathological processes, including transcriptional regulation and gene regulation. As lncRNA interacts with multiple proteins, predicting lncRNA-protein interactions (lncRPIs) is an important way to study the functions of lncRNA. Up to now, there have been a few works that exploit protein-protein interactions (PPIs) to help the prediction of new lncRPIs. Results: In this paper, we propose to boost the prediction of lncRPIs by fusing multiple protein-protein similarity networks (PPSNs). Concretely, we first construct four PPSNs based on protein sequences, protein domains, protein GO terms and the STRING database respectively, then build a more informative PPSN by fusing these four constructed PPSNs. Finally, we predict new lncRPIs by a random walk method with the fused PPSN and known lncRPIs. Our experimental results show that the new approach outperforms the existing methods. Conclusion: Fusing multiple protein-protein similarity networks can effectively boost the performance of predicting lncRPIs.
- Published
- 2017
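The final step described in the record above, a random walk over a protein similarity network seeded with known interactions, can be sketched as a random walk with restart. The function name, the restart probability of 0.5, and the three-protein toy network are assumptions for illustration only. The abstract does not specify the variant or parameters the authors used.

```python
def random_walk_with_restart(sim, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Rank network nodes by proximity to a set of seed nodes.

    sim:   symmetric similarity matrix (list of lists) over the proteins.
    seeds: non-empty set of indices of proteins known to bind the query lncRNA.
    Returns the converged visiting probabilities, one per protein.
    """
    n = len(sim)
    # Column-normalise the similarities into transition probabilities.
    col_sums = [sum(sim[i][j] for i in range(n)) for j in range(n)]
    trans = [[sim[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
             for i in range(n)]
    start = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    prob = start[:]
    for _ in range(max_iter):
        # Walk one step with probability (1 - restart), else jump back to a seed.
        nxt = [(1 - restart) * sum(trans[i][j] * prob[j] for j in range(n))
               + restart * start[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, prob)) < tol:
            return nxt
        prob = nxt
    return prob

# Toy chain of three proteins: 0 - 1 - 2, with protein 0 as the only seed.
toy_sim = [[0, 1, 0],
           [1, 0, 1],
           [0, 1, 0]]
toy_prob = random_walk_with_restart(toy_sim, seeds={0})
```

Proteins that are not seeds but end up with high probability are the candidate new partners of the query lncRNA. In the toy chain, protein 1 ranks above protein 2 because it is directly similar to the seed.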
33. Projection-Based Neighborhood Non-Negative Matrix Factorization for lncRNA-Protein Interaction Prediction
- Author
-
Yingjun Ma, Tingting He, and Xingpeng Jiang
- Subjects
lncRNA-protein interaction ,feature projection ,neighborhood completion ,graph non-negative matrix factorization ,kernel neighborhood similarity ,Genetics ,QH426-470 - Abstract
Many long ncRNAs (lncRNAs) exert their effects by interacting with the corresponding RNA-binding proteins, and identifying the interactions between lncRNAs and proteins is important to understand the functions of lncRNA. Compared with the time-consuming and laborious experimental methods, more and more computational models are proposed to predict lncRNA-protein interactions. However, few models can effectively utilize the biological network topology of lncRNA (protein) and combine its sequence structure features, and most models cannot effectively predict new proteins (lncRNA) that do not interact with any lncRNA (proteins). In this study, we proposed a projection-based neighborhood non-negative matrix decomposition model (PMKDN) to predict potential lncRNA-protein interactions by integrating multiple biological features of lncRNAs (proteins). First, according to lncRNA (protein) sequences and lncRNA expression profile data, we extracted multiple features of lncRNA (protein). Second, based on protein GO ontology annotation, lncRNA sequences, lncRNA (protein) feature information, and modified lncRNA-protein interaction network, we calculated multiple similarities of lncRNA (protein), and fused them to obtain a more accurate lncRNA (protein) similarity network. Finally, combining the similarity and various feature information of lncRNA (protein), as well as the modified interaction network, we proposed a projection-based neighborhood non-negative matrix decomposition algorithm to predict the potential lncRNA-protein interactions. On two benchmark datasets, PMKDN showed better performance than other state-of-the-art methods for the prediction of new lncRNA-protein interactions, new lncRNAs, and new proteins. Case study further indicates that PMKDN can be used as an effective tool for lncRNA-protein interaction prediction.
- Published
- 2019
34. Projection-Based Neighborhood Non-Negative Matrix Factorization for lncRNA-Protein Interaction Prediction.
- Author
-
Ma, Yingjun, He, Tingting, and Jiang, Xingpeng
- Subjects
NONNEGATIVE matrices ,MATRIX decomposition ,RNA-binding proteins ,LINCRNA ,BIOLOGICAL networks ,DNA-binding proteins - Abstract
Many long ncRNAs (lncRNAs) exert their effects by interacting with the corresponding RNA-binding proteins, and identifying the interactions between lncRNAs and proteins is important to understand the functions of lncRNA. Compared with the time-consuming and laborious experimental methods, more and more computational models are proposed to predict lncRNA-protein interactions. However, few models can effectively utilize the biological network topology of lncRNA (protein) and combine its sequence structure features, and most models cannot effectively predict new proteins (lncRNA) that do not interact with any lncRNA (proteins). In this study, we proposed a projection-based neighborhood non-negative matrix decomposition model (PMKDN) to predict potential lncRNA-protein interactions by integrating multiple biological features of lncRNAs (proteins). First, according to lncRNA (protein) sequences and lncRNA expression profile data, we extracted multiple features of lncRNA (protein). Second, based on protein GO ontology annotation, lncRNA sequences, lncRNA (protein) feature information, and modified lncRNA-protein interaction network, we calculated multiple similarities of lncRNA (protein), and fused them to obtain a more accurate lncRNA (protein) similarity network. Finally, combining the similarity and various feature information of lncRNA (protein), as well as the modified interaction network, we proposed a projection-based neighborhood non-negative matrix decomposition algorithm to predict the potential lncRNA-protein interactions. On two benchmark datasets, PMKDN showed better performance than other state-of-the-art methods for the prediction of new lncRNA-protein interactions, new lncRNAs, and new proteins. Case study further indicates that PMKDN can be used as an effective tool for lncRNA-protein interaction prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2019
35. ACCBN: ant-Colony-clustering-based bipartite network method for predicting long non-coding RNA–protein interactions.
- Author
-
Zhu, Rong, Li, Guangshun, Liu, Jin-Xing, Dai, Ling-Yun, and Guo, Ying
- Subjects
- *BIPARTITE graphs, *RNA, *PROTEIN-protein interactions, *METASTASIS, *CANCER cells, *NUCLEOTIDES
- Abstract
Background: Long non-coding RNAs (lncRNAs) play an important role in tumor development, invasion, and metastasis. Analyzing and screening the differential expression of lncRNAs in cancer and the corresponding paracancerous tissues provides new clues for finding cancer diagnostic indicators and improving treatment. Predicting lncRNA–protein interactions is very important in the analysis of lncRNAs. This article proposes an Ant-Colony-Clustering-Based Bipartite Network (ACCBN) method, which combines ant colony clustering and bipartite network inference to predict lncRNA–protein interactions. Results: A five-fold cross-validation method was used in the experimental test. Compared with the RWR, ProCF, LPIHN, and LPBNI methods, ACCBN achieves significantly better values of the evaluation indicators on the test set. Conclusions: With the continuous development of biology, research on the interaction functions between proteins has become a new key topic alongside research on cellular processes. Studies of protein-protein interactions have important implications for bioinformatics, clinical medicine, and pharmacology. However, there are many kinds of proteins, and the functions of their interactions are complicated. Moreover, experimental methods require time for confirmation because interactions are difficult to estimate. Therefore, a viable solution is to predict protein-protein interactions efficiently with computers. The ACCBN method performs well in predicting protein-protein interactions in terms of sensitivity, precision, accuracy, and F1-score. [ABSTRACT FROM AUTHOR]
- Published
- 2019
36. RPIPCM: A deep network model for predicting lncRNA-protein interaction based on sequence feature encoding.
- Author
-
Gong L, Chen J, Cui X, and Liu Y
- Subjects
- Animals, Computational Biology methods, RNA, Long Noncoding genetics
- Abstract
LncRNA-protein interaction plays an important regulatory role in biological processes. In this paper, the proposed RPIPCM, based on a novel deep network model, uses the sequence feature encoding of both RNA and protein to predict lncRNA-protein interactions (LPIs). A sliding-window negative sampling method is proposed to address the imbalance between positive and negative samples. Comparative experiments show that the proposed negative sampling method helps to solve the data imbalance problem in existing LPI research. Experimental results also show that the proposed sequence feature encoding method performs well in predicting LPIs on datasets of different sizes and types. On the animal-related RPI488 dataset, compared with the direct original sequence encoding model, the sequence feature encoding model increased accuracy by 1.02%, recall by 4.08%, and MCC by 1.67%. On the plant dataset ATH948, the sequence feature-based encoding achieved 1.58% higher accuracy, 1.53% higher recall, 1.62% higher specificity, 1.62% higher precision, and 3.16% higher MCC than the direct original sequence-based encoding. Compared with the latest prediction work on the ZEA22133 dataset, RPIPCM is more effective, with accuracy increased by 2.23%, recall by 1.78%, specificity by 2.67%, precision by 2.52%, and MCC by 4.43%, which also demonstrates the effectiveness and robustness of RPIPCM.
In conclusion, RPIPCM, a deep network model based on sequence feature encoding, can automatically mine the hidden feature information of the sequences in lncRNA-protein interactions without relying on external features or prior biomedical knowledge, and its low cost and high efficiency can provide a reference for biomedical researchers. Competing Interests: We declare that we have no financial and personal relationships with other people or organizations that can inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service and/or company that could be construed as influencing the position presented in the manuscript entitled “RPIPCM: A deep network model for predicting lncRNA-protein interaction based on sequence feature encoding”. (Copyright © 2023 Elsevier Ltd. All rights reserved.)
- Published
- 2023
37. HLPI-Ensemble: Prediction of human lncRNA-protein interactions based on ensemble strategy.
- Author
-
Hu, Huan, Zhang, Li, Ai, Haixin, Zhang, Hui, Fan, Yetian, Zhao, Qi, and Liu, Hongsheng
- Abstract
LncRNA plays an important role in many biological processes and in disease progression by binding to related proteins. However, the experimental methods for studying lncRNA-protein interactions are time-consuming and expensive. Although a few models have been designed to predict ncRNA-protein interactions, they share some common drawbacks that limit their predictive performance. In this study, we present a model called HLPI-Ensemble designed specifically for human lncRNA-protein interactions. HLPI-Ensemble adopts an ensemble strategy based on three mainstream machine learning algorithms, Support Vector Machines (SVM), Random Forests (RF) and Extreme Gradient Boosting (XGB), to generate HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble, respectively. The results of 10-fold cross-validation show that HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble achieved AUCs of 0.95, 0.96 and 0.96, respectively, on the test dataset. Furthermore, we compared the performance of the HLPI-Ensemble models with that of previous models on an external validation dataset. The results show that the HLPI-Ensemble models produce far fewer false positives (FPs) than the previous models and also score higher on the other evaluation indicators. This further shows that the HLPI-Ensemble models are superior to previous models in predicting human lncRNA-protein interactions. HLPI-Ensemble is publicly available at: http://ccsipb.lnu.edu.cn/hlpiensemble/. [ABSTRACT FROM AUTHOR]
- Published
- 2018
38. The linear neighborhood propagation method for predicting long non-coding RNA–protein interactions.
- Author
-
Zhang, Wen, Qu, Qianlong, Zhang, Yunqiu, and Wang, Wei
- Subjects
- *RNA-protein interactions, *NON-coding RNA, *LINEAR programming, *BIOMATHEMATICS, *SOURCE code
- Abstract
Long non-coding RNAs (lncRNAs) have gained wide attention because of their essential functions in a variety of biological processes. Though the precise functions and mechanisms of most lncRNAs remain unknown, studies show that lncRNAs generally exert their functions through interactions with the corresponding RNA-binding proteins. The experimental detection of lncRNA–protein interactions is costly and time-consuming. In this paper, we propose a linear neighborhood propagation method (LPLNP) to predict lncRNA–protein interactions. LPLNP calculates the linear neighborhood similarity in the feature space, transfers it into the interaction space, and predicts unobserved interactions between lncRNAs and proteins by a label propagation process. Our study shows that the LPLNP model based on the known lncRNA–protein interactions alone performs well, achieving an AUPR score of 0.42. Furthermore, incorporating biological information on lncRNAs and proteins into the LPLNP model further improves performance, achieving an AUPR score of 0.4584. The case study demonstrates that many lncRNA–protein interactions predicted by our method can be validated, indicating that our method is a useful tool for lncRNA–protein interaction prediction. The source code and the dataset used in the paper are available at: https://github.com/BioMedicalBigDataMiningLabWhu/lncRNA-protein-interaction-prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2018
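Illustrative sketch (not taken from the paper): the label propagation step described in the abstract above can be pictured as repeatedly mixing each lncRNA's scores with those of its neighbors. The sketch assumes a similarity matrix `W` that is already row-normalized, which stands in for the linear neighborhood similarity that LPLNP actually computes; the function name, `alpha`, and the toy data are all hypothetical.

```python
def label_propagation(W, Y, alpha=0.5, n_iter=100):
    """Iterate F <- alpha * W @ F + (1 - alpha) * Y using plain lists.

    W: row-normalized lncRNA-lncRNA similarity matrix (n x n), assumed given.
    Y: known interactions, Y[i][j] = 1 if lncRNA i binds protein j (n x m).
    """
    n, m = len(Y), len(Y[0])
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        F = [
            [
                alpha * sum(W[i][k] * F[k][j] for k in range(n))
                + (1 - alpha) * Y[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return F


# Toy data: lncRNA 2 has no known partners but is most similar to lncRNA 0.
W = [
    [0.0, 0.2, 0.8],
    [0.5, 0.0, 0.5],
    [0.8, 0.2, 0.0],
]
Y = [
    [1, 0],
    [0, 1],
    [0, 0],
]
scores = label_propagation(W, Y)
```

In this toy case the unlabeled lncRNA 2 ends up with a higher score for protein 0 than for protein 1, because it inherits labels mainly from lncRNA 0.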
39. Capsule-LPI: a LncRNA–protein interaction predicting tool based on a capsule network
- Author
-
Wei Du, Shiyao Feng, Hang Sun, Ying Li, Siyu Han, and Qi Zhang
- Subjects
Web server, Computer science, QH301-705.5, lncRNA–protein interaction, Computer applications to medicine. Medical informatics, R858-859.7, Machine learning, computer.software_genre, Biochemistry, 03 medical and health sciences, 0302 clinical medicine, Structural Biology, Biology (General), Molecular Biology, Subnetwork, 030304 developmental biology, 0303 health sciences, business.industry, Methodology Article, Applied Mathematics, Deep learning, Computational Biology, Computer Science Applications, 030220 oncology & carcinogenesis, Key (cryptography), RNA, Long Noncoding, Artificial intelligence, business, Capsule network, computer, Long noncoding RNA
- Abstract
Background Long noncoding RNAs (lncRNAs) play important roles in multiple biological processes. Identifying LncRNA–protein interactions (LPIs) is key to understanding lncRNA functions. Although some LPIs computational methods have been developed, the LPIs prediction problem remains challenging. How to integrate multimodal features from more perspectives and build deep learning architectures with better recognition performance have always been the focus of research on LPIs. Results We present a novel multichannel capsule network framework to integrate multimodal features for LPI prediction, Capsule-LPI. Capsule-LPI integrates four groups of multimodal features, including sequence features, motif information, physicochemical properties and secondary structure features. Capsule-LPI is composed of four feature-learning subnetworks and one capsule subnetwork. Through comprehensive experimental comparisons and evaluations, we demonstrate that both multimodal features and the architecture of the multichannel capsule network can significantly improve the performance of LPI prediction. The experimental results show that Capsule-LPI performs better than the existing state-of-the-art tools. The precision of Capsule-LPI is 87.3%, which represents a 1.7% improvement. The F-value of Capsule-LPI is 92.2%, which represents a 1.4% improvement. Conclusions This study provides a novel and feasible LPI prediction tool based on the integration of multimodal features and a capsule network. A webserver (http://csbg-jlu.site/lpc/predict) is developed to be convenient for users.
- Published
- 2021
40. A Hybrid Prediction Method for Plant lncRNA-Protein Interaction
- Author
-
Jael Sanyanda Wekesa, Yushi Luan, Ming Chen, and Jun Meng
- Subjects
autoencoder, random forest, light gradient boosting machine, hybrid, lncRNA-protein interaction, plant, Cytology, QH573-671
- Abstract
Long non-protein-coding RNAs (lncRNAs) identification and analysis are pervasive in transcriptome studies due to their roles in biological processes. In particular, lncRNA-protein interaction has plausible relevance to gene expression regulation and in cellular processes such as pathogen resistance in plants. While lncRNA-protein interaction has been studied in animals, there has yet to be extensive research in plants. In this paper, we propose a novel plant lncRNA-protein interaction prediction method, namely PLRPIM, which combines deep learning and shallow machine learning methods. The selection of an optimal feature subset and subsequent efficient compression are significant challenges for deep learning models. The proposed method adopts k-mer and extracts high-level abstraction sequence-based features using stacked sparse autoencoder. Based on the extracted features, the fusion of random forest (RF) and light gradient boosting machine (LGBM) is used to build the prediction model. The performances are evaluated on Arabidopsis thaliana and Zea mays datasets. Results from experiments demonstrate PLRPIM’s superiority compared with other prediction tools on the two datasets. Based on 5-fold cross-validation, we obtain 89.98% and 93.44% accuracy, 0.954 and 0.982 AUC for Arabidopsis thaliana and Zea mays, respectively. PLRPIM predicts potential lncRNA-protein interaction pairs effectively, which can facilitate lncRNA related research including function prediction.
- Published
- 2019
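Illustrative sketch (not taken from the paper): the k-mer encoding mentioned in the abstract above turns a sequence into a fixed-length vector of k-mer frequencies, which a model such as an autoencoder can then compress. The function name, the default `k`, and the example sequence are hypothetical; the paper's exact settings are not given in the abstract.

```python
from itertools import product


def kmer_frequencies(seq, k=3, alphabet="ACGU"):
    """Return normalized k-mer frequencies in a fixed lexicographic order."""
    seq = seq.upper().replace("T", "U")  # treat DNA-style input as RNA
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[km] / total if total else 0.0 for km in kmers]


# A 2-mer encoding of a short RNA gives a 4**2 = 16-dimensional vector.
features = kmer_frequencies("AUGGCUAUGG", k=2)
```

The vector length depends only on `k` and the alphabet size, so sequences of different lengths map to inputs of the same dimension.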
41. LPI-EnEDT: an ensemble framework with extra tree and decision tree classifiers for imbalanced lncRNA-protein interaction data classification
- Author
-
Liqian Zhou, Ling Shen, Pengfei Gao, Ruya Yuan, and Lihong Peng
- Subjects
Class imbalance, Computer science, Computer applications to medicine. Medical informatics, Data classification, R858-859.7, Decision tree, Inference, Biochemistry, Prediction methods, Genetics, Prediction bias, lncRNA-protein interaction, Molecular Biology, QA299.6-433, business.industry, Research, Pattern recognition, Ensemble learning, Computer Science Applications, Interaction information, Computational Mathematics, Tree (data structure), Computational Theory and Mathematics, Artificial intelligence, business, Ensemble, Analysis
- Abstract
Background: Long noncoding RNAs (lncRNAs) have dense linkages with various biological processes. Identifying interacting lncRNA-protein pairs contributes to understanding the functions and mechanisms of lncRNAs. Wet experiments are costly and time-consuming. Most computational methods fail to account for the imbalanced nature of lncRNA-protein interaction (LPI) data. More importantly, they were evaluated on a single dataset, which introduces prediction bias. Results: In this study, we develop an Ensemble framework (LPI-EnEDT) with Extra tree and Decision Tree classifiers to implement imbalanced LPI data classification. First, five LPI datasets are arranged. Second, lncRNAs and proteins are separately characterized based on Pyfeat and BioTriangle and concatenated as a vector to represent each lncRNA-protein pair. Finally, an ensemble framework with Extra tree and decision tree classifiers is developed to classify unlabeled lncRNA-protein pairs. The comparative experiments demonstrate that LPI-EnEDT outperforms four classical LPI prediction methods (LPI-BLS, LPI-CatBoost, LPI-SKF, and PLIPCOM) under cross validations on lncRNAs, proteins, and LPIs. The average AUC values on the five datasets are 0.8480, 0.7078, and 0.9066 under the three cross validations, respectively. The average AUPRs are 0.8175, 0.7265, and 0.8882, respectively. Case analyses suggest that there are underlying associations between HOTTIP and Q9Y6M1, and between NRON and Q15717. Conclusions: By fusing diverse biological features of lncRNAs and proteins and exploiting an ensemble learning model with Extra tree and decision tree classifiers, this work focuses on imbalanced LPI data classification as well as interaction information inference for a new lncRNA (or protein).
- Published
- 2021
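Illustrative sketch (not taken from the paper): the AUC values quoted in the abstract above can be read as the probability that a randomly chosen interacting pair is scored above a randomly chosen non-interacting pair. The sketch computes that quantity directly by comparing every positive with every negative; the function name and toy scores are hypothetical, and real evaluations would normally use an established library routine.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Two known interactions among five candidate pairs.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

Because AUC ignores how many negatives there are, AUPR is often reported alongside it for imbalanced LPI data, as the abstract does.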
42. LPI-deepGBDT: a multiple-layer deep framework based on gradient boosting decision trees for lncRNA–protein interaction identification
- Author
-
Zhao Wang, Xiongfei Tian, Lihong Peng, and Liqian Zhou
- Subjects
Identification methods, QH301-705.5, Computer science, lncRNA–protein interaction, Computer applications to medicine. Medical informatics, R858-859.7, Decision tree, Multiple-layer deep architecture, Computational biology, Biochemistry, Multiple layer, Structural Biology, Humans, Biology (General), Molecular Biology, Gradient boosting decision tree, Research, Applied Mathematics, Decision Trees, Computational Biology, Plants, Ensemble learning, Computer Science Applications, Identification (information), RNA, Long Noncoding, Gradient boosting, DNA microarray, Predictive modelling
- Abstract
Background: Long noncoding RNAs (lncRNAs) play important roles in various biological and pathological processes. Discovery of lncRNA–protein interactions (LPIs) contributes to understanding the biological functions and mechanisms of lncRNAs. Although wet experiments have found a few interactions between lncRNAs and proteins, experimental techniques are costly and time-consuming. Therefore, computational methods are increasingly exploited to uncover possible associations. However, existing computational methods have several limitations. First, most of them were evaluated on one simple dataset, which may result in prediction bias. Second, few of them are applied to identify relevant data for new lncRNAs (or proteins). Finally, they fail to utilize diverse biological information on lncRNAs and proteins. Results: Using a feed-forward deep architecture based on gradient boosting decision trees (LPI-deepGBDT), this work focuses on classifying unobserved LPIs. First, three human LPI datasets and two plant LPI datasets are arranged. Second, the biological features of lncRNAs and proteins are extracted by Pyfeat and BioProt, respectively. Third, the features are dimensionally reduced and concatenated as a vector to represent an lncRNA–protein pair. Finally, a deep architecture composed of forward mappings and inverse mappings is developed to predict underlying linkages between lncRNAs and proteins. LPI-deepGBDT is compared with five classical LPI prediction models (LPI-BLS, LPI-CatBoost, PLIPCOM, LPI-SKF, and LPI-HNM) under three cross validations on lncRNAs, proteins, and lncRNA–protein pairs, respectively. It obtains the best average AUC and AUPR values in the majority of situations, significantly outperforming the other five LPI identification methods. Specifically, the AUCs computed by LPI-deepGBDT are 0.8321, 0.6815, and 0.9073, respectively, and the AUPRs are 0.8095, 0.6771, and 0.8849, respectively. The results demonstrate the powerful classification ability of LPI-deepGBDT.
Case study analyses show that there may be interactions between GAS5 and Q15717, RAB30-AS1 and O00425, and LINC-01572 and P35637. Conclusions: By integrating ensemble learning and hierarchical distributed representations and building a multiple-layered deep architecture, this work improves LPI prediction performance and effectively probes interaction data for new lncRNAs/proteins.
- Published
- 2021
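Illustrative sketch (not taken from the paper): the three cross validations mentioned in the abstract above differ in what is held out. Splitting on lncRNAs (or proteins) keeps every pair of a held-out lncRNA (or protein) out of training, which mimics prediction for a new molecule, while splitting on pairs hides individual interactions only. The function name, fold assignment, and toy identifiers are hypothetical; the paper's exact protocol is not given in the abstract.

```python
import random


def cv_folds(pairs, mode, n_folds=5, seed=0):
    """Yield (train, test) lists of (lncRNA, protein) pairs.

    mode: "lncRNA" or "protein" holds out whole molecules; "pair" holds out
    individual pairs.
    """
    if mode == "pair":
        units, key = sorted(set(pairs)), (lambda pair: pair)
    elif mode == "lncRNA":
        units, key = sorted({l for l, _ in pairs}), (lambda pair: pair[0])
    elif mode == "protein":
        units, key = sorted({p for _, p in pairs}), (lambda pair: pair[1])
    else:
        raise ValueError("mode must be 'lncRNA', 'protein', or 'pair'")
    random.Random(seed).shuffle(units)
    fold_of = {u: i % n_folds for i, u in enumerate(units)}
    for fold in range(n_folds):
        test = [p for p in pairs if fold_of[key(p)] == fold]
        train = [p for p in pairs if fold_of[key(p)] != fold]
        yield train, test


# Toy candidate pairs: 4 lncRNAs x 3 proteins.
pairs = [(l, p) for l in ("L1", "L2", "L3", "L4") for p in ("P1", "P2", "P3")]
lnc_folds = list(cv_folds(pairs, "lncRNA", n_folds=2))
```

The lower AUCs reported for the protein-wise setting are consistent with it being the harder task, since the model sees no interactions at all for the held-out proteins.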
43. A text feature-based approach for literature mining of lncRNA–protein interactions.
- Author
-
Li, Ao, Zang, Qiguang, Sun, Dongdong, and Wang, Minghui
- Subjects
- *RNA-protein interactions, *SENTENCES (Grammar), *TEXT mining, *DATA extraction, *NATURAL language processing, *CLASSIFICATION algorithms
- Abstract
Long non-coding RNAs (lncRNAs) play important roles in regulation at the transcriptional and post-transcriptional levels. Knowledge of lncRNA–protein interactions (LPIs) is crucial for lncRNA-related biomedical research. Many newly discovered LPIs are stored in the biomedical literature. With over one million new biomedical journal articles published every year, just keeping up with novel findings requires automatic information extraction by text mining. To address this issue, we apply a text feature-based text mining approach to efficiently extract LPIs from the biomedical literature. By employing natural language processing (NLP) technologies, this approach extracts text features from sentences that can precisely reflect real LPIs. Our approach involves four steps, including data collection, text pre-processing, structured representation, feature extraction, and model training and classification. Our approach achieves an F-score of 79.5%, and the results indicate that it can efficiently extract LPIs from the biomedical literature. [ABSTRACT FROM AUTHOR]
- Published
- 2016
44. Relevance search for predicting lncRNA–protein interactions based on heterogeneous network.
- Author
-
Yang, Jianghong, Li, Ao, Ge, Mengqu, and Wang, Minghui
- Subjects
- *RNA-protein interactions, *BIOLOGICAL networks, *PROTEIN-protein interactions, *DATA extraction, *CLASSIFICATION algorithms
- Abstract
lncRNA plays important roles in many biological and pathological processes. lncRNA–protein interaction is the most common way for lncRNAs to perform their functions. Thus, predicting lncRNA–protein interactions is important for understanding the nature of lncRNA. Earlier methods for predicting RNA–protein interactions typically adopt classification algorithms that use features extracted from the RNA and protein themselves. However, their performance in lncRNA–protein interaction prediction is not good enough because of the low conservation of lncRNAs. In this paper, we try to use the information implicit in the topology of the biological network associated with lncRNAs to solve this problem. First, we construct a heterogeneous lncRNA–protein network that incorporates protein interaction information. We then use an algorithm called HeteSim, which evaluates the relevance between heterogeneous objects, to perform relevance search in the heterogeneous lncRNA–protein network. The relevance search results are used to help identify lncRNA–protein interactions. [ABSTRACT FROM AUTHOR]
- Published
- 2016
45. LPI-KTASLP: Prediction of LncRNA-Protein Interaction by Semi-Supervised Link Learning With Multivariate Information
- Author
-
Jijun Tang, Fei Guo, Cong Shen, Yijie Ding, and Limin Jiang
- Subjects
0301 basic medicine, Multivariate statistics, General Computer Science, Computer science, Computational intelligence, computer.software_genre, Reduction (complexity), LncRNA-protein interaction, 03 medical and health sciences, 0302 clinical medicine, Gene expression, semi-supervised link prediction, General Materials Science, Nucleotide, Epigenetics, multiple kernel learning, Eigendecomposition of a matrix, low-rank approximation, chemistry.chemical_classification, General Engineering, kernel target alignment, RNA, Identification (information), 030104 developmental biology, chemistry, Polynucleotide, 030220 oncology & carcinogenesis, Kernel (statistics), Benchmark (computing), lcsh:Electrical engineering. Electronics. Nuclear engineering, Data mining, lcsh:TK1-9971, computer
- Abstract
Long non-coding RNAs (lncRNAs) are single-stranded polynucleotides of no fewer than 200 nucleotides each that consist of non-protein-coding transcripts. LncRNA plays a crucial role in regulating gene expression during transcriptional, post-transcriptional, and epigenetic processes. It does so by interacting with the corresponding RNA-binding proteins. Computational intelligence has attracted a lot of attention in lncRNA-protein interaction (LPI) identification because it reduces excessive laboratory cost and increases speed and accuracy. Although numerous pertinent in silico studies of LPI prediction have been proposed, there is still room for enhancing the accuracy of the existing LPI prediction methods. In this paper, we propose a novel method for identifying LPIs with kernel target alignment based on semi-supervised link prediction (LPI-KTASLP), which adopts multivariate information to predict lncRNA-protein interactions. To integrate the heterogeneous kernels, kernel target alignment is applied to perform kernel fusion. We calculate low-rank approximation matrices for lncRNAs and proteins, using eigendecomposition to reduce the computational cost. The prediction model is obtained by producing the final LPI prediction matrix. Experimental results show that the prediction ability of the LPI-KTASLP algorithm surpasses that of many other LPI prediction schemes. Our method was evaluated on a standard benchmark dataset of LPIs, where the proposed model (LPI-KTASLP) obtained the highest AUPR of 0.6148. This is superior to the integrated LPLNP (AUPR: 0.4584), the RWR (AUPR: 0.2827), the CF (AUPR: 0.2357), the LPIHN (AUPR: 0.2299), and the LPBNI (AUPR: 0.3302). It is very encouraging that most of the LPI predictions have been confirmed to be close to relevant concentrations.
- Published
- 2019
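Illustrative sketch (not taken from the paper): kernel target alignment, named in the abstract above, scores how well a similarity kernel K agrees with an ideal kernel T built from the known labels, using the normalized Frobenius inner product A(K, T) = <K, T> / sqrt(<K, K> <T, T>). The sketch below weights each kernel in proportion to its alignment, which is a simple heuristic assumed here for illustration and not necessarily the optimization used by LPI-KTASLP; all names and toy matrices are hypothetical.

```python
import math


def frobenius_inner(A, B):
    """Sum of element-wise products of two equally sized matrices."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def alignment(K, T):
    """Kernel target alignment between kernel K and target kernel T."""
    return frobenius_inner(K, T) / math.sqrt(
        frobenius_inner(K, K) * frobenius_inner(T, T)
    )


def ideal_kernel(Y):
    """Target kernel Y @ Y^T: shared partners imply similarity."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in Y] for ri in Y]


def fuse_kernels(kernels, T):
    """Weight kernels by (non-negative) alignment and return the weighted sum."""
    raw = [max(alignment(K, T), 0.0) for K in kernels]
    total = sum(raw)
    if total == 0.0:
        raise ValueError("no kernel aligns with the target")
    weights = [r / total for r in raw]
    n = len(T)
    fused = [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]
    return weights, fused


# Toy data: lncRNAs 0 and 1 share a protein partner; lncRNA 2 does not.
Y = [[1, 0], [1, 0], [0, 1]]
T = ideal_kernel(Y)
K_informative = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.1], [0.1, 0.1, 1.0]]
K_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
weights, fused = fuse_kernels([K_informative, K_identity], T)
```

Here the informative kernel receives the larger weight because it reproduces the block structure of the target kernel more closely than the identity kernel does.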
46. LPIH2V: LncRNA-protein interactions prediction using HIN2Vec based on heterogeneous networks model.
- Author
-
Wei MM, Yu CQ, Li LP, You ZH, Ren ZH, Guan YJ, Wang XF, and Li YC
- Abstract
LncRNA-protein interaction plays an important role in the development and treatment of many human diseases. Because experimental approaches to determining lncRNA-protein interactions are expensive and time-consuming and few computational methods exist, efficient and accurate methods for predicting lncRNA-protein interactions are urgently needed. In this work, a model for heterogeneous network embedding based on meta-paths, namely LPIH2V, is proposed. The heterogeneous network is composed of lncRNA similarity networks, protein similarity networks, and known lncRNA-protein interaction networks. The behavioral features are extracted from the heterogeneous network using the HIN2Vec method of network embedding. The results showed that LPIH2V obtains an AUC of 0.97 and ACC of 0.95 in the 5-fold cross-validation test. The model showed superiority and good generalization ability. Compared to other models, LPIH2V not only extracts attribute characteristics by similarity, but also acquires behavior properties by meta-path wandering in heterogeneous networks. LPIH2V would be beneficial in forecasting interactions between lncRNA and protein. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2023 Wei, Yu, Li, You, Ren, Guan, Wang and Li.)
- Published
- 2023
47. Predicting lncRNA-protein interactions with bipartite graph embedding and deep graph neural networks.
- Author
-
Ma Y, Zhang H, Jin C, and Kang C
- Abstract
Background: Long non-coding RNAs (lncRNAs) play crucial roles in numerous biological processes. Investigation of lncRNA-protein interactions contributes to discovering the undetected molecular functions of lncRNAs. In recent years, computational approaches have increasingly replaced the traditional time-consuming experiments used to uncover possible unknown associations. However, the heterogeneity in association prediction between lncRNAs and proteins has not been adequately explored. It remains challenging to integrate the heterogeneity of lncRNA-protein interactions with graph neural network algorithms. Methods: In this paper, we constructed a deep architecture based on GNNs called BiHo-GNN, which is the first to integrate the properties of homogeneous and heterogeneous networks through bipartite graph embedding. Different from previous research, BiHo-GNN can capture the mechanism of molecular association through the data encoder of heterogeneous networks. Meanwhile, we design a process of mutual optimization between homogeneous and heterogeneous networks, which promotes the robustness of BiHo-GNN. Results: We collected four datasets for predicting lncRNA-protein interactions and compared the performance of current prediction models on a benchmarking dataset. In this comparison, BiHo-GNN outperforms existing bipartite graph-based methods. Conclusion: Our BiHo-GNN integrates the bipartite graph with homogeneous graph networks. Based on this model structure, lncRNA-protein interactions and potential associations can be predicted and discovered accurately. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2023 Ma, Zhang, Jin and Kang.)
- Published
- 2023
48. Predicting lncRNA–Protein Interactions With miRNAs as Mediators in a Heterogeneous Network Model
- Author
-
Pufeng Du, Yuan-Ke Zhou, Zi-Ang Shen, Yang Gao, Tao Luo, and Han Yu
- Subjects
0301 basic medicine, lcsh:QH426-470, business.industry, Computer science, lncRNA–protein interaction, heterogeneous network, network similarity, Machine learning, computer.software_genre, Protein–protein interaction, 03 medical and health sciences, miRNA–protein interaction, lcsh:Genetics, 030104 developmental biology, 0302 clinical medicine, 030220 oncology & carcinogenesis, lncRNA–miRNA interaction, Genetics, Methods, Molecular Medicine, Artificial intelligence, business, computer, Genetics (clinical), Heterogeneous network
- Abstract
Long non-coding RNAs (lncRNAs) play important roles in various biological processes, where lncRNA-protein interactions are usually involved. Therefore, identifying lncRNA-protein interactions is of great significance to understand the molecular functions of lncRNAs. Since the experiments to identify lncRNA-protein interactions are always costly and time consuming, computational methods are developed as alternative approaches. However, existing lncRNA-protein interaction predictors usually require prior knowledge of lncRNA-protein interactions with experimental evidences. Their performances are limited due to the number of known lncRNA-protein interactions. In this paper, we explored a novel way to predict lncRNA-protein interactions without direct prior knowledge. MiRNAs were picked up as mediators to estimate potential interactions between lncRNAs and proteins. By validating our results based on known lncRNA-protein interactions, our method achieved an AUROC (Area Under Receiver Operating Curve) of 0.821, which is comparable to the state-of-the-art methods. Moreover, our method achieved an improved AUROC of 0.852 by further expanding the training dataset. We believe that our method can be a useful supplement to the existing methods, as it provides an alternative way to estimate lncRNA-protein interactions in a heterogeneous network without direct prior knowledge. All data and codes of this work can be downloaded from GitHub (https://github.com/zyk2118216069/LncRNA-protein-interactions-prediction).
- Published
- 2020
49. The Landscape of Long Non-Coding RNA Dysregulation and Clinical Relevance in Muscle Invasive Bladder Urothelial Carcinoma
- Author
-
Lindsay M. Wong, Haotian Shen, Rachel High, Jessica Wang-Rodriguez, Eric Y. Chang, Wei Tse Li, Megan Chu, and Weg M. Ongkeko
- Subjects
0301 basic medicine, Urologic Diseases, bladder carcinoma, Cancer Research, Bladder Urothelial Carcinoma, Oncology and Carcinogenesis, lncRNAs, medicine.disease_cause, Article, 03 medical and health sciences, 0302 clinical medicine, Cancer genome, Genetics, Medicine, 2.1 Biological and endogenous factors, Clinical significance, lncRNA-protein interaction, Cancer, Bladder cancer, business.industry, Human Genome, Muscle invasive, Patient survival, TCGA, medicine.disease, Long non-coding RNA, 030104 developmental biology, Oncology, 030220 oncology & carcinogenesis, Cancer research, business, Carcinogenesis, Biotechnology
- Abstract
Bladder cancer is one of the most common cancers in the United States, but few advancements in treatment options have occurred in the past few decades. This study aims to identify the most clinically relevant long non-coding RNAs (lncRNAs) to serve as potential biomarkers and treatment targets for muscle invasive bladder cancer (MIBC). Using RNA-sequencing data from 406 patients in The Cancer Genome Atlas (TCGA) database, we identified differentially expressed lncRNAs in MIBC vs. normal tissues. We then associated lncRNA expression with patient survival, clinical variables, oncogenic signatures, cancer- and immune-associated pathways, and genomic alterations. We identified a panel of 20 key lncRNAs that were most implicated in MIBC prognosis after differential expression analysis and prognostic correlations. Almost all lncRNAs we identified are correlated significantly with oncogenic processes. In conclusion, we discovered previously undescribed lncRNAs strongly implicated in the MIBC disease course that may be leveraged for diagnostic and treatment purposes in the future. Functional analysis of these lncRNAs may also reveal distinct mechanisms of bladder cancer carcinogenesis.
- Published
- 2019
50. Fusing multiple protein-protein similarity networks to effectively predict lncRNA-protein interactions
- Author
-
Libo Luo, Shuigeng Zhou, Kai Tian, Xiaoxiong Zheng, Yang Wang, Jiaogen Zhou, and Jihong Guan
- Subjects
0301 basic medicine ,Protein domain ,Computational biology ,Random walk ,Biology ,computer.software_genre ,lcsh:Computer applications to medicine. Medical informatics ,Biochemistry ,Protein–protein interaction ,03 medical and health sciences ,0302 clinical medicine ,Similarity (network science) ,Structural Biology ,Area under curve ,Humans ,Molecular Biology ,lcsh:QH301-705.5 ,String database ,Regulation of gene expression ,Sequence Homology, Amino Acid ,Applied Mathematics ,Protein protein ,Research ,Proteins ,Computer Science Applications ,Similarity network fusion ,030104 developmental biology ,ROC Curve ,lcsh:Biology (General) ,Area Under Curve ,lcsh:R858-859.7 ,RNA, Long Noncoding ,lncRNA-Protein Interaction ,Data mining ,DNA microarray ,computer ,030217 neurology & neurosurgery - Abstract
Background: Long non-coding RNA (lncRNA) plays important roles in many biological and pathological processes, including transcriptional regulation and gene regulation. As lncRNA interacts with multiple proteins, predicting lncRNA-protein interactions (lncRPIs) is an important way to study the functions of lncRNA. Up to now, there have been a few works that exploit protein-protein interactions (PPIs) to help the prediction of new lncRPIs. Results: In this paper, we propose to boost the prediction of lncRPIs by fusing multiple protein-protein similarity networks (PPSNs). Concretely, we first construct four PPSNs based on protein sequences, protein domains, protein GO terms and the STRING database respectively, then build a more informative PPSN by fusing these four constructed PPSNs. Finally, we predict new lncRPIs by a random walk method with the fused PPSN and known lncRPIs. Our experimental results show that the new approach outperforms the existing methods. Conclusion: Fusing multiple protein-protein similarity networks can effectively boost the performance of predicting lncRPIs.
- Published
- 2017
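As a rough illustration of the pipeline summarized in the abstract of result 50 (build several protein-protein similarity networks, fuse them, then run a random walk seeded with known interactions), here is a minimal Python sketch. It is not the authors' implementation: the fusion step is replaced by a plain element-wise average rather than similarity network fusion, the walk is a random walk with restart using an arbitrarily chosen restart probability, and all function names, matrices, and parameter values are hypothetical.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def fuse_by_average(networks):
    """Stand-in fusion (assumption): element-wise mean of equally sized matrices."""
    n = len(networks[0])
    k = len(networks)
    return [[sum(net[i][j] for net in networks) / k for j in range(n)]
            for i in range(n)]


def random_walk_with_restart(similarity, seeds, restart=0.5, tol=1e-9, max_iter=1000):
    """Score proteins for one lncRNA by walking the protein similarity network.

    `seeds` marks proteins already known to interact with the lncRNA (1 or 0).
    """
    total = sum(seeds)
    if total == 0:
        raise ValueError("at least one known interaction is required")
    n = len(similarity)
    w = row_normalize(similarity)
    p0 = [s / total for s in seeds]
    p = p0[:]
    for _ in range(max_iter):
        # One step: follow network edges with prob. (1 - restart), else jump to seeds.
        nxt = [(1 - restart) * sum(p[i] * w[i][j] for i in range(n)) + restart * p0[j]
               for j in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy data (hypothetical): two similarity networks over three proteins.
sequence_sim = [[0.0, 0.8, 0.1], [0.8, 0.0, 0.3], [0.1, 0.3, 0.0]]
domain_sim = [[0.0, 0.6, 0.2], [0.6, 0.0, 0.4], [0.2, 0.4, 0.0]]

fused = fuse_by_average([sequence_sim, domain_sim])
# Protein 0 is the only protein known to interact with the lncRNA of interest.
scores = random_walk_with_restart(fused, seeds=[1, 0, 0])
```

In this toy setup, proteins that are more similar to the known interactor receive higher scores, so the ranking of `scores` can be read as a candidate list of new interactions for that lncRNA.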