Single-sequence protein structure prediction using supervised transformer protein language models.
- Source :
- Nature computational science [Nat Comput Sci] 2022 Dec; Vol. 2 (12), pp. 804-814. Date of Electronic Publication: 2022 Dec 19.
- Publication Year :
- 2022
Abstract
- Significant progress has been made in protein structure prediction in recent years. However, it remains challenging for AlphaFold2 and other deep learning-based methods to predict protein structure with single-sequence input. Here we introduce trRosettaX-Single, an automated algorithm for single-sequence protein structure prediction. It incorporates the sequence embedding from a supervised transformer protein language model into a multi-scale network enhanced by knowledge distillation to predict inter-residue two-dimensional geometry, which is then used to reconstruct three-dimensional structures via energy minimization. Benchmark tests show that trRosettaX-Single outperforms AlphaFold2 and RoseTTAFold on orphan proteins and works well on human-designed proteins (with an average template modeling score (TM-score) of 0.79). An experimental test shows that the full trRosettaX-Single pipeline is two times faster than AlphaFold2, using far fewer computing resources (<10%). On 2,000 designed proteins from network hallucination, trRosettaX-Single generates structure models with high confidence. As a demonstration, trRosettaX-Single is applied to missense mutation analysis. These data suggest that trRosettaX-Single may find potential applications in protein design and related studies. (© 2022. The Author(s), under exclusive licence to Springer Nature America, Inc.)
- Subjects :
- Humans
Distillation
Electric Power Supplies
Language
Algorithms
Benchmarking
Details
- Language :
- English
- ISSN :
- 2662-8457
- Volume :
- 2
- Issue :
- 12
- Database :
- MEDLINE
- Journal :
- Nature computational science
- Publication Type :
- Academic Journal
- Accession number :
- 38177395
- Full Text :
- https://doi.org/10.1038/s43588-022-00373-3