ncRNA-protein Interaction Prediction using Language-based Features
- Author
- Shah, Krishna
- Subjects
- Machine Learning, Transformer, RNA-Protein Interaction, NLP, noncoding RNA, Bioinformatics
- Abstract
Noncoding RNAs (ncRNAs) play a significant role in several fundamental biological processes by binding to RNA-binding proteins (RBPs); hence, it is necessary to study ncRNA-protein interaction (RPI). Several classical machine learning and deep learning models have been proposed to predict RPI. These models first require features of the RNA and protein, such as physicochemical properties and secondary and tertiary structure, to be collected before they are fed into the model. More recently, with advances in high-throughput sequencing and Natural Language Processing (NLP), transformer models like BERT-RBP and Evolutionary Scale Modeling (ESM) can be trained to automatically extract feature representations, containing both low- and high-level information, directly from RNA and protein sequences. This approach could make manual feature collection optional. Hence, in this study, we compare the performance of such language-based features against manually created features in predicting the interaction probability between a protein and an RNA.
- Published
- 2022