1. A generic deep convolutional neural network framework for prediction of receptor-ligand interactions-NetPhosPan: application to kinase phosphorylation prediction
- Author
- Søren Brunak, Jose M. G. Izarzugaza, Morten Nielsen, Vanessa Isabell Jurtz, and Emilio Fenoy
- Subjects
Statistics and Probability, Protein family, Computer science, B-cell receptor, Computational biology, Ligands, Biochemistry, Convolutional neural network, 03 medical and health sciences, Protein structure, Phosphorylation, Receptor, Protein kinase A, Molecular Biology, 030304 developmental biology, 0303 health sciences, Artificial neural network, Kinase, Ligand, 030302 biochemistry & molecular biology, Proteins, SUPERFAMILY, Computer Science Applications, Computational Mathematics, Computational Theory and Mathematics, Neural Networks, Computer, Protein Kinases
- Abstract
Motivation: Understanding the specificity of protein receptor–ligand interactions is pivotal for our comprehension of biological mechanisms and systems. Receptor protein families often have a certain level of sequence diversity that converges into fewer conserved protein structures, allowing the exertion of well-defined functions. T and B cell receptors of the immune system, and protein kinases that control the dynamic behaviour and decision processes in eukaryotic cells by catalysing phosphorylation, represent prime examples. Driven by the large sequence diversity, the receptors within such protein families are often found to share specificities although they are divergent at the sequence level. This observation has led to the notion that prediction models of such systems are most effectively handled in a receptor-specific manner.
Results: We show that this approach is suboptimal in many cases, and describe an alternative, improved framework for generating models with pan-receptor predictive power for receptor protein families. The framework is based on deep artificial neural networks and integrates information from individual receptors into a single pan-receptor model. By leveraging information across multiple receptor-specific datasets, it allows prediction of the receptor specificity for all members of a given protein family, including those described by limited or no ligand data. The approach was applied to the protein kinase superfamily, leading to the method NetPhosPan. The method was extensively validated and benchmarked against state-of-the-art prediction methods and was found to have unprecedented performance, in particular for kinase domains characterized by limited or no experimental data.
Availability and implementation: The method is freely available to non-commercial users and can be downloaded at http://www.cbs.dtu.dk/services/NetPhospan-1.0.
Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2018