GPApred: The first computational predictor for identifying proteins with LPXTG-like motif using sequence-based optimal features.

Authors :
Malik A
Shoombuatong W
Kim CB
Manavalan B
Source :
International journal of biological macromolecules [Int J Biol Macromol] 2023 Feb 28; Vol. 229, pp. 529-538. Date of Electronic Publication: 2022 Dec 31.
Publication Year :
2023

Abstract

The cell surface proteins of gram-positive bacteria are involved in many important biological functions, including the infection of host cells. Owing to their virulent nature, these proteins are also considered strong candidates for potential drug or vaccine targets. Among the various cell surface proteins of gram-positive bacteria, LPXTG-like proteins form a major class. These proteins have a highly conserved C-terminal cell wall sorting signal, which consists of an LPXTG sequence motif, a hydrophobic domain, and a positively charged tail. These surface proteins are targeted to the cell envelope by a sortase enzyme via transpeptidation. A variety of LPXTG-like proteins have been experimentally characterized; however, their number in public databases has increased owing to extensive bacterial genome sequencing without proper annotation. In the absence of experimental characterization, identifying and annotating these sequences is extremely challenging. Therefore, in this study, we developed the first machine learning-based predictor called GPApred, which can identify LPXTG-like proteins from their primary sequences. Using a newly constructed benchmark dataset, we explored different classifiers and five feature encodings and their hybrids. Optimal features were derived using the recursive feature elimination method, and these features were then trained using a support vector machine algorithm. The performance of different models was evaluated using independent datasets, and a final model (GPApred) was selected based on consistency during cross-validation and independent assessment. GPApred can be an effective tool for predicting LPXTG-like sequences and can be further employed for functional characterization or drug targeting.

Availability: https://procarb.org/gpapred/

Competing Interests: The authors declare no competing interests.

(Copyright © 2023 The Authors. Published by Elsevier B.V. All rights reserved.)
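
The C-terminal sorting signal described in the abstract (an LPXTG motif, then a hydrophobic domain, then a positively charged tail) can be illustrated with a naive rule-based scan. This is not GPApred's method, which is a machine learning model trained on optimal sequence features; it is only a minimal sketch of the signal's layout. The function name, the residue sets, the window size, and all thresholds are assumptions chosen for illustration and do not come from the paper.

```python
import re

# Assumed residue classes (illustrative, not taken from the paper).
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")

# LPXTG, where X is any residue.
MOTIF = re.compile(r"LP.TG")


def has_sorting_signal(seq, window=50, tail_len=8,
                       min_hydrophobic_frac=0.6, min_positive=1):
    """Heuristically check the C-terminal region for an LPXTG-like signal.

    All numeric defaults are hypothetical thresholds for this sketch.
    """
    seq = seq.upper()
    # Only the last `window` residues are inspected (C-terminal region).
    region = seq[max(0, len(seq) - window):]
    for match in MOTIF.finditer(region):
        rest = region[match.end():]
        if len(rest) <= tail_len:
            continue  # not enough residues for both a core and a tail
        core, tail = rest[:-tail_len], rest[-tail_len:]
        hydrophobic_frac = sum(c in HYDROPHOBIC for c in core) / len(core)
        positives = sum(c in POSITIVE for c in tail)
        if hydrophobic_frac >= min_hydrophobic_frac and positives >= min_positive:
            return True
    return False


# Synthetic sequences built for this example; they are not real proteins.
with_signal = "MSDE" * 15 + "LPETG" + "AVLIF" * 4 + "SRRKKREE"
without_motif = "MSDE" * 20
print(has_sorting_signal(with_signal), has_sorting_signal(without_motif))
```

A rule of this kind misses variant motifs and flags chance matches, which is one reason a trained sequence-based classifier is attractive for this task.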

Details

Language :
English
ISSN :
1879-0003
Volume :
229
Database :
MEDLINE
Journal :
International journal of biological macromolecules
Publication Type :
Academic Journal
Accession number :
36596370
Full Text :
https://doi.org/10.1016/j.ijbiomac.2022.12.315