Back to Search Start Over

Data-driven learning of structure augments quantitative prediction of biological responses.

Authors :
Ha, Yuanchi
Ma, Helena R.
Wu, Feilun
Weiss, Andrea
Duncker, Katherine
Xu, Helen Z.
Lu, Jia
Golovsky, Max
Reker, Daniel
You, Lingchong
Source :
PLoS Computational Biology; 6/3/2024, Vol. 20 Issue 6, p1-24, 24p
Publication Year :
2024

Abstract

Multi-factor screenings are commonly used in diverse applications in medicine and bioengineering, including optimizing combination drug treatments and microbiome engineering. Despite the advances in high-throughput technologies, large-scale experiments typically remain prohibitively expensive. Here we introduce a machine learning platform, structure-augmented regression (SAR), that exploits the intrinsic structure of each biological system to learn a high-accuracy model with minimal data requirement. Under different environmental perturbations, each biological system exhibits a unique, structured phenotypic response. This structure can be learned based on limited data and once learned, can constrain subsequent quantitative predictions. We demonstrate that SAR requires significantly fewer data comparing to other existing machine-learning methods to achieve a high prediction accuracy, first on simulated data, then on experimental data of various systems and input dimensions. We then show how a learned structure can guide effective design of new experiments. Our approach has implications for predictive control of biological systems and an integration of machine learning prediction and experimental design. Author summary: Using limited data to predict how biological systems respond to combinations of perturbations is important for better understanding of biological systems and for practical applications. Here we present a new method, structure-augmented regression, that efficiently learns biological response landscapes from sparse measurements. Our method exploits the property that many biological response landscapes have distinct lower-dimensional structures, which can be learned first to assist the subsequent quantitative predictions. We demonstrate that our algorithm outperforms existing algorithms on simulated and experimental data of various biological systems and dimensions. We further exploit the learned structure by suggesting new experiments that refines the learned structure to further improve the prediction accuracy. By integrating machine learning with experimental design, our approach has implications for the predictive control of biological systems. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1553734X
Volume :
20
Issue :
6
Database :
Complementary Index
Journal :
PLoS Computational Biology
Publication Type :
Academic Journal
Accession number :
177634697
Full Text :
https://doi.org/10.1371/journal.pcbi.1012185