Using the antibody-antigen binding interface to train image-based deep neural networks for antibody-epitope classification

Authors :
Daniel R. Ripoll
Sidhartha Chaudhury
Anders Wallqvist
Source :
PLoS Computational Biology, Vol 17, Iss 3, p e1008864 (2021)
Publication Year :
2021
Publisher :
Public Library of Science (PLoS), 2021.

Abstract

High-throughput B-cell sequencing has opened up new avenues for investigating complex mechanisms underlying our adaptive immune response. These technological advances drive data generation and the need to mine and analyze the information contained in these large datasets, in particular the identification of therapeutic antibodies (Abs) or those associated with disease exposure and protection. Here, we describe our efforts to use artificial intelligence (AI)-based image analysis for prospective classification of Abs based solely on sequence information. We hypothesized that Abs recognizing the same part of an antigen share a limited set of features at the binding interface, and that the binding site regions of these Abs share common structure and physicochemical property patterns that can serve as a “fingerprint” to recognize uncharacterized Abs. We combined large-scale sequence-based protein-structure predictions to generate ensembles of 3-D Ab models, reduced the Ab binding interface to a 2-D image (fingerprint), used pre-trained convolutional neural networks to extract features, and trained deep neural networks (DNNs) to classify Abs. We evaluated this approach using Ab sequences derived from human HIV and Ebola viral infections to differentiate between two Abs, Abs belonging to specific B-cell family lineages, and Abs with different epitope preferences. In addition, we explored a different type of DNN method to detect one class of Abs from a larger pool of Abs. Testing on Ab sets that had been kept aside during model training, we achieved average prediction accuracies ranging from 71% to 96%, depending on the complexity of the classification task. The high accuracies reached during these classification tests suggest that the DNN models were able to learn a series of structural patterns shared by Abs belonging to the same class. The developed methodology provides a means to apply AI-based image recognition techniques to analyze high-throughput B-cell sequencing datasets (repertoires) for Ab classification.

Author summary

The ability to take advantage of the rapid progress in AI for biological and medical applications oftentimes requires looking at the problem from a non-traditional point of view. The adaptive immune system plays a key role in providing long-term immunity against pathogens. The repertoire of circulating B cells that produce unique pathogen-specific antibodies in an individual contains immense information on both the status of the immune response at a particular time and that individual’s immune history. With high-throughput sequencing, we can now obtain Ab sequences for thousands of B cells from a single patient blood sample, but functionally characterizing antibodies on this scale remains a daunting task. Here, we propose to use AI to functionally classify Abs from sequence alone by re-casting this classification problem as an image recognition problem. Just as traditional image recognition involves training AI to distinguish different types of objects, we sought to use AI to distinguish different types of Ab-antigen binding interfaces. Towards that end, we generated ensembles of Ab structures from sequence, generated 2-D ‘fingerprints’ of each structure that capture the essential molecular and chemical structure of the Ab binding site regions, and trained a convolutional and deep neural network-based AI model to classify Ab fingerprints associated with different functional characteristics. We applied this DNN-based approach to accurately predict antibody family lineage and epitope specificity against Ebola and HIV-1 viruses, and to detect sequence-diverse antibodies with binding properties similar to those of the antibodies we used for training.
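
The step of reducing a 3-D binding interface to a 2-D image can be illustrated with a minimal sketch. This record does not describe how the authors actually project or encode the interface, so everything here is an assumption made for illustration: the `Atom` tuple layout, the `fingerprint` function name, the flat projection onto the x-y plane, and the grid size and extent are all hypothetical. The sketch covers only the rasterisation idea; in the pipeline described above, the resulting image would then be passed to a pre-trained convolutional network for feature extraction.

```python
from typing import List, Tuple

# Hypothetical input: (x, y, z, property) for each interface atom, where
# "property" stands in for some physicochemical value (e.g. charge).
# This layout is an assumption, not the representation used in the paper.
Atom = Tuple[float, float, float, float]


def fingerprint(atoms: List[Atom], size: int = 8,
                extent: float = 16.0) -> List[List[float]]:
    """Project atoms onto the x-y plane and average the property per pixel.

    The grid is size x size pixels, centred on the origin, and covers
    `extent` coordinate units along each axis. Atoms outside the grid
    are ignored; empty pixels are set to 0.0.
    """
    sums = [[0.0] * size for _ in range(size)]
    counts = [[0] * size for _ in range(size)]
    cell = extent / size  # width of one pixel in coordinate units

    for x, y, _z, prop in atoms:
        # Shift coordinates so the grid starts at 0, then bin into pixels.
        col = int((x + extent / 2) // cell)
        row = int((y + extent / 2) // cell)
        if 0 <= row < size and 0 <= col < size:
            sums[row][col] += prop
            counts[row][col] += 1

    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else 0.0
         for c in range(size)]
        for r in range(size)
    ]


if __name__ == "__main__":
    demo = [(0.0, 0.0, 1.0, 2.0), (0.5, 0.5, 3.0, 4.0), (100.0, 0.0, 0.0, 1.0)]
    image = fingerprint(demo)
    for line in image:
        print(" ".join(f"{v:4.1f}" for v in line))
```

Averaging per pixel is one simple design choice; a real encoding would more likely use several channels (one per property) and a projection aligned to the binding site rather than to fixed coordinate axes.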

Details

Language :
English
ISSN :
15537358
Volume :
17
Issue :
3
Database :
OpenAIRE
Journal :
PLoS Computational Biology
Accession number :
edsair.doi.dedup.....460d18315d9872d6d77c8497c96699d2