Two novel approaches to identify phage receptor-binding protein sequences

Authors :
Boeckaerts, Dimitri
Stock, Michiel
De Baets, Bernard
Briers, Yves
Source :
Viruses of Microbes 2022, Abstracts
Publication Year :
2022

Abstract

Receptor-binding proteins (RBPs) of bacteriophages initiate the infection of their corresponding bacterial host and act as the primary determinant of host specificity. The ever-increasing amount of sequence data enables the development of predictive models for the automated identification of RBP sequences. However, the development of such models is challenged by the inconsistent or missing annotation of many phage proteins. Recently developed tools have started to bridge this gap but are not specifically focused on RBP sequences, for which many different annotations are available. We have developed two parallel approaches to facilitate the identification of RBP sequences in phage genomic data. The first combines known RBP-related hidden Markov models (HMMs) from the Pfam database with custom-built HMMs to identify phage RBPs based on protein domains. The second consists of training an extreme gradient boosting classifier that can accurately discriminate between RBPs and other phage proteins. We explain how these complementary approaches can reinforce each other in identifying RBP sequences. In addition, we benchmarked our methods against the recently developed PhANNs tool. Our best-performing model reached an area under the precision-recall curve of 93.8% and outperformed PhANNs on an independent test set, with an F1-score of 84.0% compared to 69.8% for PhANNs.
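
For readers unfamiliar with the F1-score used in the benchmark above, the following minimal sketch shows how it is computed for a binary RBP versus non-RBP classification. It is an illustration only, not the authors' evaluation code: the function name `precision_recall_f1` and the toy labels are hypothetical and do not come from the study.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, where 1 marks an RBP."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # RBPs correctly called
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-RBPs called as RBP
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # RBPs that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels (assumption for illustration, not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Because F1 balances precision and recall, it penalizes a classifier that finds most RBPs only by also flagging many other phage proteins, which is why it is a common choice for imbalanced protein-annotation tasks.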

Details

Language :
English
Database :
OpenAIRE
Journal :
Viruses of Microbes 2022, Abstracts
Accession number :
edsair.od.......330..6ac5231330d1c8a95b941826872b1bf8