1. Leveraging implicit knowledge in neural networks for functional dissection and engineering of proteins
- Author
- Daniel Heid, Dominik Niopek, Max C. Waldhauer, Irina Lehmann, Moritz Jakob Przybilla, Pauline L. Pfuderer, Roland Eils, Thore Bürgel, Catharina Gandor, Marita Klein, Mareike D. Hoffmann, Carolin Schmelas, Felix Bubeck, Julius Upmeier zu Belzen, Lukas Platz, Jan Mathony, Lukas Adam, Stefan Holderbach, Max Schwendemann, and Michael Jendrusch
- Subjects
- 0301 basic medicine, Artificial neural network, Computer Networks and Communications, Computer science, Computational biology, Protein engineering, Molecular machine, Implicit knowledge, Human-Computer Interaction, 03 medical and health sciences, 030104 developmental biology, 0302 clinical medicine, Protein sequencing, Signalling, Artificial Intelligence, Leverage (statistics), Computer Vision and Pattern Recognition, Small molecule binding, 030217 neurology & neurosurgery, Software
- Abstract
- Proteins are nature’s most versatile molecular machines. Deep neural networks trained on large protein datasets have recently been used to tackle the unmet complexity of protein sequence–function relationships. The implicit knowledge contained in these networks represents a powerful, but thus far inaccessible, resource for understanding protein biology. Here, we show that occlusion-based sensitivity analysis can leverage the knowledge present in deep-neural-network-based protein sequence classifiers to identify functionally relevant parts of proteins. We first validated our approach by successfully predicting positions that mediate small molecule binding or catalytic activity across different protein classes. Next, we inferred the impact of point mutations on the activity of ERK and HRas, signalling factors frequently deregulated in cancer. Finally, we used our approach to identify engineering hotspots in CRISPR–Cas9 and anti-CRISPR protein AcrIIA4. Our work demonstrates how implicit knowledge in neural networks can be harnessed for protein functional dissection and protein engineering. Deep neural networks are a powerful tool for predicting protein function, but identifying the specific parts of a protein sequence that are relevant to its functions remains a challenge. An occlusion-based sensitivity technique helps interpret these deep neural networks, and can guide protein engineering by locating functionally relevant protein positions.
- Published
- 2019
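
The occlusion-based sensitivity analysis named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `occlusion_sensitivity`, the mask character `X`, the window size, and the toy scoring function standing in for a trained sequence classifier are all assumptions made for illustration. The idea shown is only the general one: mask each position in turn, re-score the sequence, and treat a large drop in score as a hint that the position matters.

```python
from typing import Callable, List


def occlusion_sensitivity(
    sequence: str,
    score_fn: Callable[[str], float],
    window: int = 1,
    mask_char: str = "X",
) -> List[float]:
    """Return the score drop caused by masking each window of the sequence.

    score_fn is a hypothetical stand-in for a trained classifier that maps
    a protein sequence to a confidence score for some class of interest.
    """
    baseline = score_fn(sequence)
    drops = []
    for start in range(len(sequence) - window + 1):
        # Replace the window with the mask character, keeping the length fixed.
        occluded = sequence[:start] + mask_char * window + sequence[start + window:]
        drops.append(baseline - score_fn(occluded))
    return drops


def toy_score(seq: str) -> float:
    """Toy 'classifier' (an assumption): scores 1.0 if a made-up motif is intact."""
    return 1.0 if "GKS" in seq else 0.0


drops = occlusion_sensitivity("MAAGKSLL", toy_score)
sensitive_positions = [i for i, d in enumerate(drops) if d > 0]
print(sensitive_positions)
```

With a real model, `score_fn` would wrap the network's forward pass, and the choice of mask token and window size would need to match how that model was trained.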