PMSFF: Improved Protein Binding Residues Prediction through Multi-Scale Sequence-Based Feature Fusion Strategy.

Authors :
Li Y
Nan X
Zhang S
Zhou Q
Lu S
Tian Z
Source :
Biomolecules [Biomolecules] 2024 Sep 27; Vol. 14 (10). Date of Electronic Publication: 2024 Sep 27.
Publication Year :
2024

Abstract

Proteins perform diverse biological functions by binding to various molecules, and these interactions are mediated by a few key residues. Accurate prediction of such protein binding residues (PBRs) is crucial for understanding cellular processes and for designing new drugs. Many computational approaches have been proposed to identify PBRs from sequence-based features. However, these approaches face two main challenges: (1) they only concatenate residue feature vectors using a simple sliding-window strategy, and (2) it is difficult to find a single sliding-window size suitable for learning embeddings across different types of PBRs. In this study, we propose a novel framework that can be applied to multiple types of PBRs Prediction tasks through a Multi-scale Sequence-based Feature Fusion (PMSFF) strategy. First, PMSFF employs a pre-trained protein language model, ProtT5, to encode the amino acid residues in protein sequences. It then generates multi-scale residue embeddings by applying windows of multiple sizes to capture effective neighboring residues and kernels of multiple sizes to learn information across different scales. Additionally, the proposed model treats protein sequences as sentences, employing a bidirectional GRU to learn global context. We also collect benchmark datasets encompassing various PBR types and evaluate PMSFF on these datasets. Compared with state-of-the-art methods, PMSFF demonstrates superior performance on most PBR prediction tasks.
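
The multi-size window idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give window sizes, kernel sizes, or the fusion operator, so everything below is an assumption made for illustration. The function names `window_mean` and `multi_scale_features`, the window sizes `(3, 5, 7)`, and the use of mean pooling in place of learned convolution kernels are all hypothetical, and the toy one-dimensional embeddings stand in for real ProtT5 residue embeddings.

```python
from typing import List, Sequence

Vector = List[float]


def window_mean(embeddings: Sequence[Vector], center: int, size: int) -> Vector:
    """Average the embeddings in a window of `size` residues centered on `center`.

    Positions outside the sequence count as zero vectors (zero padding),
    so the sum is always divided by `size`.
    """
    half = size // 2
    dim = len(embeddings[0])
    total = [0.0] * dim
    for pos in range(center - half, center + half + 1):
        if 0 <= pos < len(embeddings):
            for k, value in enumerate(embeddings[pos]):
                total[k] += value
    return [t / size for t in total]


def multi_scale_features(
    embeddings: Sequence[Vector],
    window_sizes: Sequence[int] = (3, 5, 7),  # assumed sizes, not from the paper
) -> List[Vector]:
    """Concatenate, per residue, one pooled vector for each window size."""
    for size in window_sizes:
        if size < 1 or size % 2 == 0:
            raise ValueError("window sizes must be positive odd integers")
    if not embeddings:
        return []
    features = []
    for i in range(len(embeddings)):
        row: Vector = []
        for size in window_sizes:
            row.extend(window_mean(embeddings, i, size))
        features.append(row)
    return features


# Toy usage: three residues with one-dimensional embeddings.
toy = [[1.0], [2.0], [3.0]]
fused = multi_scale_features(toy, window_sizes=(1, 3))
```

Each residue ends up with one feature block per window size, so no single window size has to suit every binding type. In a trained model such as the one described, these per-scale views would presumably feed learned kernels and a bidirectional GRU rather than a fixed average.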

Details

Language :
English
ISSN :
2218-273X
Volume :
14
Issue :
10
Database :
MEDLINE
Journal :
Biomolecules
Publication Type :
Academic Journal
Accession number :
39456153
Full Text :
https://doi.org/10.3390/biom14101220