1. TUnA: an uncertainty-aware transformer model for sequence-based protein-protein interaction prediction.
- Author
Ko, Young, Parkinson, Jonathan, Liu, Cong, and Wang, Wei
- Abstract
Protein-protein interactions (PPIs) are important for many biological processes, but predicting them from sequence data remains challenging. Existing deep learning models often cannot generalize to proteins not present in the training set and do not provide uncertainty estimates for their predictions. To address these limitations, we present TUnA, a Transformer-based uncertainty-aware model for PPI prediction. TUnA uses ESM-2 embeddings with Transformer encoders and incorporates a Spectral-normalized Neural Gaussian Process. TUnA achieves state-of-the-art performance and, importantly, evaluates uncertainty for unseen sequences. We demonstrate that TUnA's uncertainty estimates can effectively identify the most reliable predictions, significantly reducing false positives. This capability is crucial in bridging the gap between computational predictions and experimental validation.
- Published
2024