1. Deep generative models for T cell receptor protein sequences
- Author
- Elias Harkins, Philip Bradley, Branden J. Olson, Kristian Davidsen, William S DeWitt, Jean Feng, and Frederick A. Matsen
- Subjects
0301 basic medicine; Computer science; Receptors, Antigen, T-Cell, alpha-beta; Parameterized complexity; Adaptive Immunity; immunology; 0302 clinical medicine; Immunology and Inflammation; computational biology; T cell expansion; vaccine; variational autoencoder; Biology (General); Recombination, Genetic; General Neuroscience; Repertoire; systems biology; General Medicine; Tools and Resources; Deep neural networks; Medicine; Computational and Systems Biology; Biotechnology; QH301-705.5; Science; General Biochemistry, Genetics and Molecular Biology; 03 medical and health sciences; Genetics; Humans; General Immunology and Microbiology; Models, Genetic; T-cell receptor; Probabilistic logic; Genetic Variation; Autoencoder; 030104 developmental biology; inflammation; Biochemistry and Cell Biology; 030217 neurology & neurosurgery; Generative grammar; repertoire modeling
- Abstract
Probabilistic models of adaptive immune repertoire sequence distributions can be used to infer the expansion of immune cells in response to stimulus, differentiate genetic from environmental factors that determine repertoire sharing, and evaluate the suitability of various target immune sequences for stimulation via vaccination. Classically, these models are defined in terms of a probabilistic V(D)J recombination model which is sometimes combined with a selection model. In this paper, we take a different approach, fitting variational autoencoder (VAE) models parameterized by deep neural networks to T cell receptor (TCR) repertoires. We show that simple VAE models can perform accurate cohort frequency estimation, learn the rules of V(D)J recombination, and generalize well to unseen sequences. Further, we demonstrate that VAE-like models can distinguish between real sequences and sequences generated according to a recombination-selection model, and that many characteristics of VAE-generated sequences are similar to those of real sequences.
- Published
- 2019