Inference of annealed protein fitness landscapes with AnnealDCA.

Authors :
Luca Sesta
Andrea Pagnani
Jorge Fernandez-de-Cossio-Diaz
Guido Uguzzoni
Source :
PLoS Computational Biology, Vol 20, Iss 2, p e1011812 (2024)
Publication Year :
2024
Publisher :
Public Library of Science (PLoS), 2024.

Abstract

The design of proteins that perform specific tasks is a major challenge in molecular biology, with important diagnostic and therapeutic applications. High-throughput screening methods have been developed to systematically evaluate protein activity, but only a small fraction of possible protein variants can be tested using these techniques. Computational models that explore the sequence space in silico to identify the fittest molecules for a given function are needed to overcome this limitation. In this article, we propose AnnealDCA, a machine-learning framework to learn the protein fitness landscape from sequencing data derived from a broad range of experiments that use selection and sequencing to quantify protein activity. We demonstrate the effectiveness of our method by applying it to antibody Rep-Seq data from immunized mice and to screening experiments, and by assessing the quality of the fitness landscape reconstructions. Our method can be applied to several experimental settings in which a population of protein variants undergoes multiple rounds of selection and sequencing. Because it does not rely on computing variant enrichment ratios, it can be used even when the sequence samples are disjoint.

Subjects

Subjects :
Biology (General)
QH301-705.5

Details

Language :
English
ISSN :
1553-734X and 1553-7358
Volume :
20
Issue :
2
Database :
Directory of Open Access Journals
Journal :
PLoS Computational Biology
Publication Type :
Academic Journal
Accession number :
edsdoj.748fac7a95b3481d9d0044122031bc14
Document Type :
article
Full Text :
https://doi.org/10.1371/journal.pcbi.1011812