
Best of both worlds: An expansion of the state of the art pKa model with data from three industrial partners

Authors :
Fraczkiewicz, Robert
Quoc Nguyen, Huy
Wu, Newton
Kausch‐Busies, Nina
Grimbs, Sergio
Sommer, Kai
ter Laak, Antonius
Günther, Judith
Wagner, Björn
Reutlinger, Michael
Source :
Molecular Informatics; October 2024, Vol. 43 Issue: 10
Publication Year :
2024

Abstract

In a unique collaboration between Simulations Plus and several industrial partners, we developed version 11.0 of the previously published in silico pKa model, S+pKa, with considerably improved prediction accuracy. The model's training set was vastly expanded with large amounts of experimental data obtained from F. Hoffmann‐La Roche AG, Genentech Inc., and the Crop Science division of Bayer AG. The previous version, v7.0, was trained on data from public sources and the Pharmaceutical division of Bayer AG. The new model showed dramatic improvements in predictive accuracy when externally validated on the three new contributors' compound sets. Less expected was v11.0’s improved prediction for new compounds developed at Bayer Pharma after v7.0 was released (2013–2023), even though Bayer Pharma contributed no additional data to v11.0. We illustrate the chemical space coverage of the chemistries encountered in the five domains, public and industrial, outline model construction, and discuss factors contributing to the model's success.

This work is a follow‐up to our 2015 “Best of Both Worlds” publication in the Journal of Chemical Information and Modeling, which was met with great interest from the journal's readers, with 4771 views and 79 citations to date. There we described S+pKa, a novel predictive model of ionization constants built from two large combined data sets: one compiled from the scientific literature and one obtained from Bayer Pharmaceuticals AG. S+pKa has now been upgraded with three additional large data sets from F. Hoffmann‐La Roche, Genentech, and Bayer CropScience. The present work offers new insights into the chemical spaces covered by compounds from all participating companies and the public domain, and into their influence on the model's predictive accuracy.

Details

Language :
English
ISSN :
1868-1743 and 1868-1751
Volume :
43
Issue :
10
Database :
Supplemental Index
Journal :
Molecular Informatics
Publication Type :
Periodical
Accession number :
ejs67686113
Full Text :
https://doi.org/10.1002/minf.202400088