1. Predicting protein variants with equivariant graph neural networks
- Author
Boca, Antonia and Mathis, Simon
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Quantitative Biology - Biomolecules, FOS: Biological sciences, Biomolecules (q-bio.BM), Machine Learning (cs.LG)
- Abstract
Pre-trained models have been successful in many protein engineering tasks. Most notably, sequence-based models have achieved state-of-the-art performance on protein fitness prediction, while structure-based models have been used experimentally to develop proteins with enhanced functions. However, there is a research gap in comparing structure- and sequence-based methods for predicting protein variants that are better than the wildtype protein. This paper aims to address this gap by conducting a comparative study between the abilities of equivariant graph neural networks (EGNNs) and sequence-based approaches to identify promising amino-acid mutations. The results show that our proposed structural approach achieves performance competitive with sequence-based methods while being trained on significantly fewer molecules. Additionally, we find that combining assay-labelled data with structure pre-trained models yields trends similar to those seen with sequence pre-trained models. Our code and trained models can be found at: https://github.com/semiluna/partIII-amino-acid-prediction.
- Comment
4 pages, 2 figures, accepted to the 2023 ICML Workshop on Computational Biology
- Published
2023