Back to Search Start Over

Protein embeddings and deep learning predict binding residues for various ligand classes.

Authors :
Littmann, Maria
Heinzinger, Michael
Dallago, Christian
Weissenow, Konstantin
Rost, Burkhard
Source :
Scientific Reports; 12/13/2021, Vol. 11 Issue 1, p1-15, 15p
Publication Year :
2021

Abstract

One important aspect of protein function is the binding of proteins to ligands, including small molecules, metal ions, and macromolecules such as DNA or RNA. Despite decades of experimental progress many binding sites remain obscure. Here, we proposed bindEmbed21, a method predicting whether a protein residue binds to metal ions, nucleic acids, or small molecules. The Artificial Intelligence (AI)-based method exclusively uses embeddings from the Transformer-based protein Language Model (pLM) ProtT5 as input. Using only single sequences without creating multiple sequence alignments (MSAs), bindEmbed21DL outperformed MSA-based predictions. Combination with homology-based inference increased performance to F1 = 48 ± 3% (95% CI) and MCC = 0.46 ± 0.04 when merging all three ligand classes into one. All results were confirmed by three independent data sets. Focusing on very reliably predicted residues could complement experimental evidence: For the 25% most strongly predicted binding residues, at least 73% were correctly predicted even when ignoring the problem of missing experimental annotations. The new method bindEmbed21 is fast, simple, and broadly applicable—neither using structure nor MSAs. Thereby, it found binding residues in over 42% of all human proteins not otherwise implied in binding and predicted about 6% of all residues as binding to metal ions, nucleic acids, or small molecules. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
20452322
Volume :
11
Issue :
1
Database :
Complementary Index
Journal :
Scientific Reports
Publication Type :
Academic Journal
Accession number :
154097346
Full Text :
https://doi.org/10.1038/s41598-021-03431-4