Investigating the Performance of Machine Learning Methods in Predicting Functional Properties of the Hydrogenase Variants.

Authors :
Choi, Gyucheol
Kim, Wonjun
Koo, Jamin
Source :
Biotechnology & Bioprocess Engineering. Feb2023, Vol. 28 Issue 1, p143-151. 9p.
Publication Year :
2023

Abstract

Improving a functional property of an enzyme via mutagenesis remains a challenging problem because of the vast search space and the difficulty of predicting the effects of mutation(s). Machine learning has proven proficient at solving similar problems with unprecedented speed, owing to the latest advances in computing power and analytical algorithms. In this study, we investigate the performance of machine learning methods in predicting the H2 production activity and O2 tolerance of hydrogenase variants. Experimentally measured activities and tolerances of 377 variants having single or double amino acid replacements are used to train and test seven types of machine learning models. Two feature sets are employed to represent each variant: a binary representation of the amino acid sequence, and VHSE, a series of vectors quantifying the physicochemical properties of amino acids. The results show that VHSE yields higher performance, especially with respect to the correlation coefficient and the coefficient of determination, in addition to the root mean square error. Next, model performance is analyzed with respect to changes in data size and heterogeneity to provide insights into designing an effective mutagenesis library for applying machine learning. The best performance is obtained when a support vector machine or ridge regression is trained using a large, homogeneous dataset. In this manner, our study reveals the factors affecting the performance of machine learning in identifying enzyme variants with enhanced function. [ABSTRACT FROM AUTHOR]
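
The binary representation and the three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy sequence, and the substitution format are assumptions made for illustration, and the VHSE descriptor table is not reproduced here because the record does not give its values.

```python
import math

# The 20 standard amino acids, one indicator bit each (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(sequence):
    """Binary representation: 20 indicator bits per residue, concatenated."""
    bits = []
    for residue in sequence:
        bits.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return bits


def apply_substitutions(sequence, substitutions):
    """Build a variant; `substitutions` maps 0-based position -> new residue."""
    residues = list(sequence)
    for position, new_residue in substitutions.items():
        residues[position] = new_residue
    return "".join(residues)


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    wild_type = "MKTAY"  # toy sequence, not a real hydrogenase
    double_mutant = apply_substitutions(wild_type, {1: "R", 3: "G"})
    features = one_hot(double_mutant)
    print(double_mutant, len(features))
```

In practice a feature vector like the one returned by `one_hot` would be passed to a regression model such as ridge regression or a support vector machine, and the metric functions would be applied to held-out predictions.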

Details

Language :
English
ISSN :
12268372
Volume :
28
Issue :
1
Database :
Academic Search Index
Journal :
Biotechnology & Bioprocess Engineering
Publication Type :
Academic Journal
Accession number :
162322550
Full Text :
https://doi.org/10.1007/s12257-022-0330-3