Fine-tuning protein language models boosts predictions across diverse tasks
- Source :
- Nature Communications, Vol 15, Iss 1, Pp 1-10 (2024)
- Publication Year :
- 2024
- Publisher :
- Nature Portfolio, 2024.
Abstract
- Prediction methods inputting embeddings from protein language models have reached or even surpassed state-of-the-art performance on many protein prediction tasks. In natural language processing, fine-tuning large language models has become the de facto standard. In contrast, most protein language model-based protein predictions do not back-propagate to the language model. Here, we compare the fine-tuning of three state-of-the-art models (ESM2, ProtT5, Ankh) on eight different tasks. Two results stand out. First, task-specific supervised fine-tuning almost always improves downstream predictions. Second, parameter-efficient fine-tuning can reach similar improvements while consuming substantially fewer resources, with up to a 4.5-fold acceleration of training over fine-tuning full models. Our results suggest always trying fine-tuning, in particular for problems with small datasets, such as fitness landscape predictions of a single protein. To ease adaptation, we provide easy-to-use notebooks to fine-tune all models used in this work for per-protein (pooling) and per-residue prediction tasks. (A minimal sketch of such a fine-tuning setup follows the record details below.)
- Subjects :
- Science
Details
- Language :
- English
- ISSN :
- 2041-1723
- Volume :
- 15
- Issue :
- 1
- Database :
- Directory of Open Access Journals
- Journal :
- Nature Communications
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.6325d3166f6e48c89be154a695c6c202
- Document Type :
- Article
- Full Text :
- https://doi.org/10.1038/s41467-024-51844-2
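The abstract describes task-specific supervised fine-tuning and parameter-efficient fine-tuning of protein language models such as ESM2 for per-protein (pooling) prediction tasks. The sketch below shows one way such a setup could look, using the Hugging Face transformers and peft libraries with LoRA adapters; the checkpoint name, LoRA hyperparameters, and toy data are illustrative assumptions, not the paper's exact configuration (the authors provide their own notebooks with the publication).

```python
# Minimal sketch of parameter-efficient fine-tuning of a protein language model
# for a per-protein (pooling) classification task. Assumes Hugging Face
# `transformers` and `peft`; checkpoint, hyperparameters, and data are
# placeholders, not the paper's exact setup.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model, TaskType

model_name = "facebook/esm2_t12_35M_UR50D"  # a small ESM2 checkpoint (assumed choice)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# LoRA: freeze the pre-trained weights and train small low-rank adapters instead,
# one common way to realize parameter-efficient fine-tuning.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # adapter rank (hypothetical value)
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query", "value"],  # attention projections in ESM2 layers
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights is trainable

# Toy forward/backward pass on a single protein sequence with a binary label.
batch = tokenizer(["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"], return_tensors="pt")
batch["labels"] = torch.tensor([1])
outputs = model(**batch)
outputs.loss.backward()  # gradients flow only through the adapter and head weights
```

The same pattern extends to per-residue tasks by swapping the sequence-classification head for a token-classification head (for example, AutoModelForTokenClassification) and supplying one label per residue instead of one per protein.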