
mtx-COBRA: Subcellular localization prediction for bacterial proteins.

Authors :
Arora I
Kummer A
Zhou H
Gadjeva M
Ma E
Chuang GY
Ong E
Source :
Computers in biology and medicine [Comput Biol Med] 2024 Mar; Vol. 171, pp. 108114. Date of Electronic Publication: 2024 Feb 10.
Publication Year :
2024

Abstract

Background: Bacteria can have beneficial effects on our health and environment; however, many are responsible for serious infectious diseases, warranting the need for vaccines against such pathogens. Bioinformatic and experimental technologies are crucial for the development of vaccines. The vaccine design pipeline requires identification of bacteria-specific antigens that can be recognized and can induce a response by the immune system upon infection. Immune system recognition is influenced by the location of a protein. Methods have been developed to determine the subcellular localization (SCL) of proteins in prokaryotes and eukaryotes. Bioinformatic tools such as PSORTb can be employed to determine SCL of proteins, which would be tedious to perform experimentally. Unfortunately, PSORTb often predicts many proteins as having an "Unknown" SCL, reducing the number of antigens to evaluate as potential vaccine targets.

Method: We present a new pipeline called subCellular lOcalization prediction for BacteRiAl Proteins (mtx-COBRA). mtx-COBRA uses Meta's protein language model, Evolutionary Scale Modeling, combined with an Extreme Gradient Boosting machine learning model to identify SCL of bacterial proteins based on amino acid sequence. This pipeline is trained on a curated dataset that combines data from UniProt and the publicly available ePSORTdb dataset.

Results: Using benchmarking analyses, nested 5-fold cross-validation, and leave-one-pathogen-out methods, followed by testing on the held-out dataset, we show that our pipeline predicts the SCL of bacterial proteins more accurately than PSORTb.

Conclusions: mtx-COBRA provides an accessible pipeline that can more efficiently classify bacterial proteins with currently "Unknown" SCLs than existing bioinformatic and experimental methods.

Competing Interests: At the time of this study, all authors were employed by Moderna, Inc., and may hold stock/stock options in the company.

(Copyright © 2024 Elsevier Ltd. All rights reserved.)
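
The leave-one-pathogen-out evaluation named in the Results can be illustrated with a minimal sketch. This is not the authors' code: the function name, the record fields (`id`, `pathogen`, `scl`), and the toy data are all assumptions made for illustration. The idea shown is only the splitting scheme, in which every pathogen is held out once as the test set while proteins from all other pathogens form the training set.

```python
from collections import defaultdict


def leave_one_pathogen_out(records):
    """Yield (held_out_pathogen, train, test) for each pathogen.

    `records` is a list of dicts with a hypothetical 'pathogen' key.
    Each pathogen is held out exactly once; all remaining records
    form the training set for that split.
    """
    by_pathogen = defaultdict(list)
    for rec in records:
        by_pathogen[rec["pathogen"]].append(rec)

    for held_out in sorted(by_pathogen):
        test = by_pathogen[held_out]
        train = [
            rec
            for pathogen, recs in by_pathogen.items()
            if pathogen != held_out
            for rec in recs
        ]
        yield held_out, train, test


# Toy, made-up records (not taken from UniProt or ePSORTdb).
records = [
    {"id": "prot1", "pathogen": "pathogen_A", "scl": "Cytoplasmic"},
    {"id": "prot2", "pathogen": "pathogen_A", "scl": "Extracellular"},
    {"id": "prot3", "pathogen": "pathogen_B", "scl": "Cytoplasmic"},
    {"id": "prot4", "pathogen": "pathogen_C", "scl": "Unknown"},
    {"id": "prot5", "pathogen": "pathogen_C", "scl": "Cytoplasmic"},
]

splits = list(leave_one_pathogen_out(records))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

In a real evaluation, a classifier would be fitted on `train` and scored on `test` for each split; holding out a whole pathogen at a time checks whether predictions generalize to organisms absent from training.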

Details

Language :
English
ISSN :
1879-0534
Volume :
171
Database :
MEDLINE
Journal :
Computers in biology and medicine
Publication Type :
Academic Journal
Accession number :
38401450
Full Text :
https://doi.org/10.1016/j.compbiomed.2024.108114