
Interpreting Potts and Transformer Protein Models Through the Lens of Simplified Attention.

Authors :
Bhattacharya N
Thomas N
Rao R
Dauparas J
Koo PK
Baker D
Song YS
Ovchinnikov S
Source :
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing [Pac Symp Biocomput] 2022; Vol. 27, pp. 34-45.
Publication Year :
2022

Abstract

The established approach to unsupervised protein contact prediction estimates coevolving positions using undirected graphical models. This approach trains a Potts model on a Multiple Sequence Alignment. Increasingly large Transformers are being pretrained on unlabeled, unaligned protein sequence databases and showing competitive performance on protein contact prediction. We argue that attention is a principled model of protein interactions, grounded in real properties of protein family data. We introduce an energy-based attention layer, factored attention, which, in a certain limit, recovers a Potts model, and use it to contrast Potts and Transformers. We show that the Transformer leverages hierarchical signal in protein family databases not captured by single-layer models. This raises the exciting possibility for the development of powerful structured models of protein family databases.
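For orientation, the two model families the abstract contrasts can be written down in a standard form. This is a sketch in generic notation, not necessarily the authors' own: the symbols h, J, A, W_V, Q, K, the head count H, and the scaling by the square root of d are all assumptions introduced here for illustration. A Potts model over an aligned sequence x = (x_1, ..., x_L) is commonly defined by single-site and pairwise terms, and a position-only attention layer can be read as a low-rank way of sharing those pairwise terms across positions.

```latex
% Standard Potts energy over an aligned sequence x of length L
% (h_i: single-site fields, J_{ij}: pairwise couplings; assumed notation)
E(x) = -\sum_{i=1}^{L} h_i(x_i) \;-\; \sum_{1 \le i < j \le L} J_{ij}(x_i, x_j),
\qquad
p(x) = \frac{\exp\bigl(-E(x)\bigr)}{Z}

% Hypothetical factored form: couplings assembled from H attention heads,
% where the attention weights A^{(h)} depend on position only and each
% W_V^{(h)} is a matrix over pairs of amino-acid states
J_{ij}(a, b) \;\approx\; \sum_{h=1}^{H} A^{(h)}_{ij}\, W_V^{(h)}(a, b),
\qquad
A^{(h)} = \operatorname{softmax}\!\left(\frac{Q^{(h)} {K^{(h)}}^{\top}}{\sqrt{d}}\right)
```

Read this way, the "certain limit" mentioned in the abstract would correspond to the factored couplings becoming expressive enough to match an unconstrained J; the precise conditions are given in the paper, not in this record.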

Details

Language :
English
ISSN :
2335-6936
Volume :
27
Database :
MEDLINE
Journal :
Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing
Publication Type :
Academic Journal
Accession number :
34890134