Prediction of pKa Using Machine Learning Methods with Rooted Topological Torsion Fingerprints: Application to Aliphatic Amines.

Authors :
Lu Y
Anand S
Shirley W
Gedeck P
Kelley BP
Skolnik S
Rodde S
Nguyen M
Lindvall M
Jia W
Source :
Journal of chemical information and modeling [J Chem Inf Model] 2019 Nov 25; Vol. 59 (11), pp. 4706-4719. Date of Electronic Publication: 2019 Nov 05.
Publication Year :
2019

Abstract

The acid-base dissociation constant, pKa, is a key parameter to define the ionization state of a compound and directly affects its biopharmaceutical profile. In this study, we developed a novel approach for pKa prediction using rooted topological torsion fingerprints in combination with five machine learning (ML) methods: random forest, partial least squares, extreme gradient boosting, lasso regression, and support vector regression. With a large and diverse set of 14 499 experimental pKa values, pKa models were developed for aliphatic amines. The models demonstrated consistently good prediction statistics and were able to generate accurate prospective predictions as validated with an external test set of 726 pKa values (RMSE 0.45, MAE 0.33, and R² 0.84 by the top model). The factors that may affect prediction accuracy and model applicability were carefully assessed. The results demonstrated that rooted topological torsion fingerprints coupled with ML methods provide a promising approach for developing accurate pKa prediction models.
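
The three statistics quoted for the external test set (RMSE, MAE, and R²) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `regression_metrics` and the sample pKa values are hypothetical, and R² is assumed here to be the coefficient of determination, since the abstract does not state which definition was used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, MAE, R^2) for paired experimental and predicted values.

    Hypothetical helper for illustration; R^2 is computed as the
    coefficient of determination, 1 - SS_res / SS_tot (an assumption).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Root-mean-square error: penalizes large deviations more strongly.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # Mean absolute error: average size of the deviation in pKa units.
    mae = sum(abs(r) for r in residuals) / n

    # Coefficient of determination relative to the mean of the observations.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return rmse, mae, r2


# Made-up pKa values, used only to exercise the function.
experimental = [8.0, 9.0, 10.0, 11.0]
predicted = [8.5, 9.0, 9.5, 11.0]
rmse, mae, r2 = regression_metrics(experimental, predicted)
print(f"RMSE={rmse:.2f}  MAE={mae:.2f}  R2={r2:.2f}")
```

RMSE and MAE are both in pKa units, so the reported values of 0.45 and 0.33 can be read directly as typical prediction errors on the 726-compound test set.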

Details

Language :
English
ISSN :
1549-960X
Volume :
59
Issue :
11
Database :
MEDLINE
Journal :
Journal of chemical information and modeling
Publication Type :
Academic Journal
Accession number :
31647238
Full Text :
https://doi.org/10.1021/acs.jcim.9b00498