
Predictive modeling of Pseudomonas syringae virulence on bean using gradient boosted decision trees.

Authors :
Almeida, Renan N. D.
Greenberg, Michael
Bundalovic-Torma, Cedoljub
Martel, Alexandre
Wang, Pauline W.
Middleton, Maggie A.
Chatterton, Syama
Desveaux, Darrell
Guttman, David S.
Source :
PLoS Pathogens; 7/25/2022, Vol. 18 Issue 7, p1-24, 24p
Publication Year :
2022

Abstract

Pseudomonas syringae is a genetically diverse bacterial species complex responsible for numerous agronomically important crop diseases. Individual P. syringae isolates are assigned pathovar designations based on their host of isolation and the associated disease symptoms, and these pathovar designations are often assumed to reflect host specificity, although this assumption has rarely been rigorously tested. Here we developed a rapid seed infection assay to measure the virulence of 121 diverse P. syringae isolates on common bean (Phaseolus vulgaris). This collection includes P. syringae phylogroup 2 (PG2) bean isolates (pathovar syringae) that cause bacterial spot disease and P. syringae phylogroup 3 (PG3) bean isolates (pathovar phaseolicola) that cause the more serious halo blight disease. We found that bean isolates in general were significantly more virulent on bean than non-bean isolates and observed no significant virulence difference between the PG2 and PG3 bean isolates. However, when we compared virulence within PGs we found that PG3 bean isolates were significantly more virulent than PG3 non-bean isolates, while there was no significant difference in virulence between PG2 bean and non-bean isolates. These results indicate that PG3 strains have a higher level of host specificity than PG2 strains. We then used gradient boosting machine learning to predict each strain's virulence on bean based on whole genome k-mers, type III secreted effector k-mers, and the presence/absence of type III effectors and phytotoxins. Our model performed best using whole genome data and was able to predict virulence with high accuracy (mean absolute error = 0.05). Finally, we functionally validated the model by predicting virulence for 16 strains and found that 15 (94%) had virulence levels within the bounds of estimated predictions. This study strengthens the hypothesis that P. syringae PG2 strains have evolved a different lifestyle than other P. syringae strains as reflected in their lower level of host specificity. It also acts as a proof-of-principle to demonstrate the power of machine learning for predicting host specific adaptation.

Author summary: Pseudomonas syringae is a genetically diverse Gammaproteobacterial species complex responsible for numerous agronomically important crop diseases. Strains in the P. syringae species complex are frequently categorized into pathovars depending on pathogenic characteristics such as host of isolation and disease symptoms. Common bean pathogens from P. syringae are known to cause two major diseases: (1) pathovar phaseolicola strains from phylogroup 3 cause halo blight disease, characterized by large necrotic lesions surrounded by a chlorotic zone or halo of yellow tissue; and (2) pathovar syringae strains from phylogroup 2 cause bacterial spot disease, characterized by brown leaf spots. While halo blight can cause serious crop losses, bacterial spot disease is generally of minor agronomic concern. Recently, statistical genetic and machine learning approaches have been applied to genomic data to identify genes underlying traits of interest or predict the outcome of host-microbe interactions. Here, we apply machine learning to P. syringae genomic data to predict virulence on bean. We first characterized the virulence of P. syringae isolates on common bean using a seed infection assay and then applied machine learning to the genomic data from the same strains to generate a predictive model for virulence on bean. We found that machine learning models built with k-mers from either full genome data or virulence factors could predict bean virulence with high accuracy. We also confirmed prior work showing that phylogroup 3 halo blight pathogens display a stronger degree of phylogenetic clustering and host specificity compared to phylogroup 2 brown spot pathogens. This work serves as a proof-of-principle for the power of machine learning for predicting host specificity and may find utility in agricultural diagnostic microbiology. [ABSTRACT FROM AUTHOR]
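
Illustrative note: the abstract names two building blocks, k-mer features and mean absolute error, without giving the k-mer length, the feature encoding, or the gradient boosting implementation used. The Python sketch below only shows what those two quantities are. The function names, the toy sequence, and the virulence scores are assumptions made up for illustration; they are not the authors' pipeline or data.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count every overlapping substring of length k in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Toy sequence (hypothetical, not real P. syringae data).
counts = kmer_counts("ATGATGC", k=3)

# Hypothetical virulence scores on an arbitrary 0-1 scale.
observed = [0.10, 0.50, 0.90]
predicted = [0.15, 0.45, 0.95]
mae = mean_absolute_error(observed, predicted)
```

In a full workflow, per-strain k-mer counts (or presence/absence values) would form the feature matrix passed to a gradient boosted regressor, and the mean absolute error would be computed on strains held out from training.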

Details

Language :
English
ISSN :
15537366
Volume :
18
Issue :
7
Database :
Complementary Index
Journal :
PLoS Pathogens
Publication Type :
Academic Journal
Accession number :
158161142
Full Text :
https://doi.org/10.1371/journal.ppat.1010716